Developmental Eye Movement (DEM) Test Norms for Mandarin Chinese-Speaking Chinese Children

The Developmental Eye Movement (DEM) test is commonly used as a clinical visual-verbal ocular motor assessment tool to screen for and diagnose reading problems at their onset. No established norm exists for using the DEM test with Mandarin Chinese-speaking Chinese children. This study aims to establish normative values of the DEM test for the Mandarin Chinese-speaking population in China and to compare these values with three published norms for English-, Spanish-, and Cantonese-speaking children. A random stratified sampling method was used to recruit children from eight kindergartens and eight primary schools in the main urban and suburban areas of Nanjing. A total of 1,425 Mandarin Chinese-speaking children aged 5 to 12 years took the DEM test in Mandarin Chinese. A digital recorder was used to record the process. All of the subjects completed a symptomatology survey, and their DEM scores were determined by a trained tester. The scores were computed using the formula in the DEM manual, except that the “vertical scores” were adjusted by taking the vertical errors into consideration. The results were compared with the three other published norms. In our subjects, a general decrease with age was observed for the four eye movement indices: vertical score, adjusted horizontal score, ratio, and total error. For both the vertical and adjusted horizontal scores, the Mandarin Chinese-speaking children completed the tests much more quickly than the norms for English- and Spanish-speaking children. However, the same group completed the test slightly more slowly than the norms for Cantonese-speaking children. The differences in the means were significant (P<0.001) in all age groups. For several ages, the scores obtained in this study were significantly different from the reported scores of Cantonese-speaking Chinese children (P<0.005). Compared with English-speaking children, only the vertical score of the 6-year-old group, the vertical-horizontal time ratio of the 8-year-old group and the errors of the 9-year-old group showed no significant difference (P>0.05); compared with Spanish-speaking children, the total error scores differed significantly (P<0.001) in all age groups except the 6-, 9-, 10-, and 11-year-old groups (P>0.05). DEM norms may be affected by differences in language, culture, and educational systems among various ethnicities. The norms of the DEM test are proposed for use with Mandarin Chinese-speaking children in Nanjing and will be proposed for children throughout China.


Introduction
Reading disability, like obesity, has become an important public health problem [1]. Children with poor reading skills include those with dyslexia, those with non-dyslexic reading disabilities, those with lower linguistic cognition and non-linguistic perceptual cognitive processing skills and those with other problems. Children with poor reading skills have reading comprehension difficulties and tend to dislike and avoid reading, resulting in a lack of reading experience that severely influences their ability to acquire knowledge compared with their peers [1,2]. These children cannot establish a solid reading comprehension foundation, which affects their subsequent academic achievement, creating a vicious cycle. Therefore, the early identification of children at risk of developing reading problems is more economical and beneficial than later intervention and treatment.
Visual and auditory processing disorders result in poor reading skills; in particular, visual processing is a prerequisite for completing reading tasks [3]. Eye movements, including saccades, fixations and regressions, are the most important skills in reading [4,5]. Children with developmental dyslexia (DD) exhibit abnormal eye movements, including an increased number of saccades, a long fixation or regression, and irregular regression distances [6]. The relationship between eye movement quality and reading difficulty has been well documented [7][8][9][10]. The morbidity of reading problems in individuals from countries where a phonetic language is spoken ranges from 5 to 10 percent [11][12]. However, to date, there has been no report of the morbidity of reading problems in China. Therefore, the evaluation of eye movements in children could help to identify reading problems at their onset and facilitate a preliminary evaluation that could provide a foundation for further research in the Chinese population.
The Developmental Eye Movement (DEM; Bernell Corp., Mishawaka, IN) test is one of the most common visual-verbal tests for evaluating ocular motor control and rapid automatized naming [13][14][15]. Since its introduction in 1987, this test has been widely used by optometrists [16]. In addition to allowing a thorough evaluation of eye movement functions, including fixation, saccade and regression, this test can evaluate visual information processing during reading. Because of its advantages, it has become a standardized clinical test and is highly recommended for the evaluation of children with reading problems, but not for dyslexic children [16]. However, to date, there is no established norm for the use of the DEM test with Mandarin Chinese-speaking Chinese children. Hence, in this study, we aimed to a) establish normative values for the DEM test for the Mandarin Chinese-speaking population in China and b) compare the obtained values with the published normative values for English- [16], Spanish- [17], and Cantonese-speaking Chinese children [18].
Specifically, the normative values were developed for Mandarin Chinese-speaking children aged 5 to 12 years in Nanjing, China.

Methods
Participants
This study was approved by the ethics committee of the Nanjing Maternal and Child Health Hospital of Nanjing Medical University, where the investigation was conducted (2012 (12)). A total of 1,206 children aged 5-6 years and 3,586 children aged 7-12 years were initially enrolled in this study. The random stratified sampling method was used to recruit children from eight kindergartens and eight primary schools in the main urban and suburban areas of Nanjing. The subjects were Chinese-speaking children aged 5 to 12 years. An age/grade-matching criterion was used for each subject. These children shared similar socio-cultural and educational backgrounds, and there was no bias caused by selection from clinical referral populations. Permission and informed consent were obtained from the principals and the children's parents or guardians. The subjects were selected from first- to eighth-grade classrooms according to the following inclusion criteria: 5 to 12 years of age, regular classroom attendance, a near-point visual acuity of 1.0 (decimal scale) at 40 cm, and successful performance on the DEM pretest. The exclusion criteria were as follows: a lack of adequate number-naming skills (for children older than 5 years) as determined by the DEM pretest, a neurological disease or physical disability, a diagnosis of mental retardation, a failure to complete the DEM test because of an inability to stay on task even after redirection, and great difficulty with the horizontal task (for example, keeping a finger in one place or becoming completely lost). Using the random-number technique, we selected as participants the children whose assigned number ended in 2, 5 or 8; children whose number ended in 1, 3, 4, 6, 7, 9 or 0 were not included. This selection and the exclusions reduced the sample size from 4,792 to 1,425 (n = 1,425; 697 males and 728 females). To the best of our knowledge, no systematic sample selection biases existed. Data were collected from September to December 2012 and from March to June 2013.

DEM test
The DEM test comprised a pretest card and three 216×279-mm test cards from the DEM test for Cantonese-speaking children [18]. The test process was simple and proceeded as follows: a child was asked to read the numbers on the test cards aloud in the order in which they appeared, and the examiner recorded the time taken and the errors the child made. The pretest consisted of 10 single-digit numbers separated by equal spacing, and it was used to confirm that the child was able to name simple numbers without difficulty. The two subtest cards (Tests A and B) consisted of 80 numbers and were divided into two groups of 40 single-digit numbers. On each card, the 40 numbers were arranged into two vertical columns of 20 numbers each. Tests A and B served to determine the child's automaticity for reading vertically aligned numbers. The third test card (Test C) consisted of the same 80 numbers arranged in a horizontal array of 16 rows with 5 numbers each. The first and fifth number of each row were aligned down the page; however, the second, third and fourth numbers in each row were randomly spaced.

Procedures [18]
The standard DEM test was administered to each student by two experienced examiners in accordance with the test norms for Cantonese-speaking children [18] and other test instructions [17,19]. The test was conducted in a quiet room in the school. For the pretest, the children were asked to read the 10 single-digit numbers out loud. If the child read all of the numbers correctly in 12 seconds or less, he/she passed the pretest, indicating that he/she could see and read the numbers clearly on the DEM test chart using his/her habitual visual abilities.
After the tests were completed, the following four scores were derived: the "vertical score" (the total time required to complete Tests A and B), the "horizontal score" (the time needed to complete Test C, with an adjustment according to the errors made), the "error score" (the total number of errors made in Test C), and the "ratio score" (the ratio of the horizontal score to the vertical score). Four types of reading errors were possible: substitution, omission, addition, and transposition. Only the "omission" and "addition" errors affected the actual reading time; therefore, the horizontal score was adjusted using the following formula (from the DEM manual): adjusted horizontal score = Test C (time in seconds)×80/(80-omission+addition) [18]. Finally, a vertical score, an adjusted horizontal score, an error score, and a ratio score were obtained for each child.
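The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not the scoring software used in the study: the names `DemScores`, `adjusted_horizontal_time` and `score_dem` are hypothetical, and the sketch assumes that the error score is the sum of the four error types counted on Test C. The adjustment of the vertical score for vertical errors is not shown because this section does not give its formula.

```python
from dataclasses import dataclass

TOTAL_DIGITS = 80  # Test C contains 80 numbers (16 rows of 5)


@dataclass
class DemScores:
    vertical: float             # time for Test A plus Test B, in seconds
    adjusted_horizontal: float  # Test C time corrected for omissions/additions
    errors: int                 # total errors on Test C
    ratio: float                # adjusted horizontal score / vertical score


def adjusted_horizontal_time(test_c_seconds, omissions, additions):
    """Apply the manual's correction: time x 80 / (80 - omissions + additions)."""
    digits_read = TOTAL_DIGITS - omissions + additions
    if digits_read <= 0:
        raise ValueError("omissions/additions leave no digits read")
    return test_c_seconds * TOTAL_DIGITS / digits_read


def score_dem(test_a_seconds, test_b_seconds, test_c_seconds,
              substitutions=0, omissions=0, additions=0, transpositions=0):
    """Derive the four DEM scores for one child (illustrative helper)."""
    vertical = test_a_seconds + test_b_seconds
    if vertical <= 0:
        raise ValueError("vertical time must be positive")
    horizontal = adjusted_horizontal_time(test_c_seconds, omissions, additions)
    # Assumption: the error score counts all four error types on Test C.
    errors = substitutions + omissions + additions + transpositions
    return DemScores(vertical, horizontal, errors, horizontal / vertical)


# Hypothetical child: 20 s on each vertical card, 45 s on Test C,
# with 2 omissions and 1 transposition.
example = score_dem(20.0, 20.0, 45.0, omissions=2, transpositions=1)
print(example)
```

Because omissions shorten the reading time and additions lengthen it, the correction scales the Test C time to what it would have been had exactly 80 numbers been read.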

Reliability and validity of the test
The time reliability test and tester reliability test were conducted separately using remeasurements to evaluate the reliability of our localized DEM test. To evaluate the time reliability of the test, we re-tested a random sample of 180 children (male:female = 1:1.1) approximately two weeks after the first (T1) test. To assess tester reliability, we switched the two testers; they re-tested a random sample of 180 children, and the test results were compared. For the internal consistency validity test, we analyzed the correlation coefficients among the three eye movement indices.
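As an illustration of how agreement between two administrations can be quantified, the sketch below computes a Pearson correlation coefficient between first-test and retest scores. The text does not state which correlation coefficient was used, so the choice of Pearson's r is an assumption, and the function name `pearson_r` and the sample values are hypothetical.

```python
import math


def pearson_r(first_scores, second_scores):
    """Pearson correlation between paired scores from two administrations."""
    if len(first_scores) != len(second_scores) or len(first_scores) < 2:
        raise ValueError("need two equally long lists with at least 2 pairs")
    n = len(first_scores)
    mean_x = sum(first_scores) / n
    mean_y = sum(second_scores) / n
    # Sums of cross-products and squared deviations around the means.
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(first_scores, second_scores))
    sxx = sum((x - mean_x) ** 2 for x in first_scores)
    syy = sum((y - mean_y) ** 2 for y in second_scores)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical vertical scores (seconds) for four children, tested twice.
t1 = [42.0, 38.5, 51.0, 45.2]
t2 = [41.0, 39.0, 49.5, 46.0]
print(round(pearson_r(t1, t2), 3))
```

The same function would apply to the tester reliability check by pairing the scores obtained by the two examiners for each child.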

Statistical analyses
Two researchers entered the data using EpiData 3.0 software (EpiData Association, Odense, Denmark). A uniqueness check, a double check, and a logic check were conducted to verify the accuracy of the data. The data were then analyzed using SPSS 17.0 (SPSS, Chicago, IL). The DEM scores are shown as the mean±SD. Qualitative and count data are shown as percentages, and the χ² test was used to compare them. An independent t test was used to compare the DEM indices between languages (Mandarin vs Cantonese; Mandarin vs English; Mandarin vs Spanish) within each age group. The Tukey-Kramer multiple comparison test was used as a post hoc test after ANOVA, and the t test was used for simple comparisons of data between two groups. A P value less than or equal to 0.05 was considered statistically significant.
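When only the mean, SD and sample size of a published norm are available, an independent t statistic can be computed from those summary values. The sketch below shows one way to do this. It is not the SPSS procedure used in the study: the function name `t_from_summary` is hypothetical, the text does not say whether equal variances were assumed (both variants are shown), and the numbers in the example are invented for illustration rather than taken from Table 4.

```python
import math


def t_from_summary(mean1, sd1, n1, mean2, sd2, n2, equal_var=True):
    """Return (t, degrees of freedom) for two independent groups.

    equal_var=True gives Student's t with a pooled variance;
    equal_var=False gives Welch's t with Welch-Satterthwaite df.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 observations")
    if equal_var:
        df = n1 + n2 - 2
        pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
        se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    else:
        v1 = sd1 ** 2 / n1
        v2 = sd2 ** 2 / n2
        se = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    if se == 0:
        raise ValueError("standard error is zero; t is undefined")
    return (mean1 - mean2) / se, df


# Invented summary values: group 1 is faster (lower mean time) than group 2.
t_value, dof = t_from_summary(10.0, 2.0, 50, 12.0, 2.0, 50)
print(t_value, dof)
```

A negative t here means the first group had the smaller mean; the P value would then be read from the t distribution with the returned degrees of freedom.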

Study participant information
A total of 1,425 children (697 males and 728 females; male-to-female ratio, 1:1.04) participated in this study. The participants were recruited from eight kindergartens and eight primary schools in the main urban and suburban areas of Nanjing, China, using a random stratified sampling method. The children were distributed among eight age groups and ranged in age from 5 to 12 years. Two experienced pediatricians conducted the DEM test in Mandarin, thus preliminarily establishing the test norms for children in Nanjing. All of the children completed the test, and their complete information was obtained. The demographic data of the subjects in all age groups are shown in Table 1.

The mean DEM scores for each age group
Based on the method and formula described above, we calculated all of the DEM scores for each subject. The mean results are shown in Table 3. The four eye movement indices (vertical score, horizontal score, vertical-horizontal ratio and total errors) decreased with age, and significant differences were observed among the age groups. This result suggests that the time required for each timed index decreased with age (Fig 1, Table 3). However, the horizontal and vertical scores were similar for the 10-, 11- and 12-year-old age groups (Fig 1, Table 3). The ratios and total errors did not differ significantly among the 8-, 9-, 10-, 11- and 12-year-old age groups (Fig 1, Table 3). These results indicate that reading speed plateaus when children reach a specific age.

Derivation of DEM norms for Mandarin-speaking children
According to the DEM test manual, clinicians should compare a child's test results with the appropriate norm tables and determine the percentile rank for each score. To construct the DEM norm tables for Cantonese-speaking children, the percentile ranks of each score in each age group were determined according to frequency distribution statistics. The norm tables for each age group in our study (Tables A to H in S1 File) are provided in the Supporting Information.
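A percentile rank derived from a frequency distribution can be sketched as follows. This is only an illustration under stated assumptions: the function name `percentile_rank` is hypothetical, lower scores (shorter times, fewer errors) are assumed to be better so that a higher rank means better performance, and tied scores are split evenly (the mid-rank convention), which the text does not specify.

```python
def percentile_rank(norm_scores, child_score):
    """Percentage of the age-group sample that the child outperforms.

    Assumes lower scores are better. Ties count as half, which is one
    common convention and not necessarily the one used for the norm tables.
    """
    if not norm_scores:
        raise ValueError("norm sample is empty")
    worse = sum(1 for s in norm_scores if s > child_score)
    tied = sum(1 for s in norm_scores if s == child_score)
    return 100.0 * (worse + 0.5 * tied) / len(norm_scores)


# Invented vertical scores (seconds) for a small same-age sample.
sample = [30.0, 35.0, 40.0, 45.0, 50.0]
print(percentile_rank(sample, 40.0))
```

In practice the lookup would be done against the published norm table for the child's age group rather than against raw sample data.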

Comparison of DEM scores for each age group with multinational norms
The current study is larger than the studies of the US and Cantonese (Hong Kong) populations and is similar in size to the study of Spanish-speaking individuals. Our study cohort and procedures are similar to those used to establish the US test norms. We compared our data with those of the DEM test authors [16], a study of DEM norms in Spanish-speaking children [20], and a study of DEM norms in a Cantonese-speaking population (Hong Kong) [18]. The mean values (standard deviations) are listed in Table 4. Most of the vertical and horizontal scores of the Mandarin-speaking children in all age groups were significantly different from those of the English-, Spanish- and Cantonese-speaking children in the other studies (independent t test; the t1, t2, and t3 values for all age groups are shown in Table 4; P1, P2 and P3<0.001). The means of the vertical and horizontal scores in our study (for the Mandarin Chinese-speaking children) were significantly smaller than those of the English-speaking children in all age groups (the t2 values for all age groups are shown in Table 4; P2<0.001, except for the vertical score in the 6-year-old group, for which P2 = 0.081). In addition, the ratio score in the 8-year-old age group was not significantly different from that of the English-speaking children (t2 = -1.52, P2>0.05, Table 4). Compared with the Spanish-speaking children, the total error scores were not significantly different for the 6- (t3 = 0.99, P3 = 0.323, Table 4), 9- (t3 = 0.47, P3 = 0.320, Table 4), 10- (t3 = -0.90, P3 = 0.372, Table 4) and 11-year-old [...] age groups. Compared with the Cantonese-speaking children, several scores were not significantly different ([...] Table 4), including the vertical-horizontal ratios for the 6-, 7- and 8-year-old groups, all of the test scores for the 9-year-old group (except the total error score), the vertical-horizontal ratio for the 10-year-old group, and the vertical and horizontal scores and the vertical-horizontal ratio for the 11-year-old group. The remaining scores were statistically significant (P1<0.001, Table 4).

Reliability and validity of the DEM test
Evaluating the reliability and validity of the Mandarin Chinese DEM scores provides an effective means of evaluating the use of the DEM test in the Chinese population. In the current study, a retest was conducted to assess the reliability of this test in Nanjing, China. The time reliability test and tester reliability test were conducted separately. To evaluate time reliability, we re-measured a random sample of 180 children approximately two weeks after the first test. The means and standard deviations of the DEM scores were similar for the two tests (Table 5). The two tests performed at different times showed correlation coefficients of 0.84, 0.90, 0.92 and 0.91 for the vertical score, horizontal score, ratio score and total error score, respectively, indicating that the test had high time reliability. To assess tester reliability, the two testers were switched, and they then re-measured a random sample of 180 children. The correlation coefficients between the tests performed by the two testers were 0.99 for each of the four indices: the vertical score, the horizontal score, the vertical-horizontal ratio, and the total error score (Table 6). These results indicated the good reliability of the DEM test. Significant declining trends were observed for the vertical score, horizontal score and total error score with increasing age (Table 3). The internal consistency validity was investigated by examining the correlation coefficients among the three eye movement indices, which were statistically significant (P<0.05 for the vertical score and P<0.001 for the horizontal score and ratio score; Table 7). Our results indicate that the DEM test norms for the Mandarin Chinese-speaking children can be used to accurately assess their clinical visual-verbal ocular motor functions.

Discussion
Children with poor reading skills include those with dyslexia, non-dyslexic reading disabilities and poor linguistic cognition and non-linguistic perceptual cognitive processing, as well as other problems. Appropriate norms for specific languages should be used to allow examiners to determine accurate DEM scores and to provide proper diagnoses for reading problems. The DEM test is a standardized clinical test that is recommended for the assessment of children with reading problems and other learning-related vision problems [13,21]. In the present study, we used stratified random sampling to recruit 1,425 children aged 5 to 12 years from eight kindergartens and eight primary schools in Nanjing. Similarly to the original DEM study [17], our participants were recruited from urban public and suburban public schools. Hence, the present study achieved sample representativeness. Importantly, our study assessed preschool children aged 5 to 6 years, whereas other studies have not evaluated this age group. Therefore, our DEM test can aid in the early identification of children at risk of reading problems other than dyslexia, thus providing a more economical and beneficial alternative to later intervention and treatment.
Mandarin Chinese has characteristics similar to Cantonese Chinese. For example, both are ideographic languages that differ from phonetic languages, such as English. Thus, we proposed the use of "adjusted vertical scores" for the vertical score norms for the Mandarin Chinese-speaking children in view of the relatively high incidence of vertical errors in our study. Our results showed significant declining trends in the vertical and horizontal scores and in the total error score with increasing age and grade level. Our results are consistent with the general notion that a child should be able to read faster with age because of the gradual development and maturity of automaticity and eye movement. No significant differences were observed in the horizontal score and the vertical score among the 10-, 11- and 12-year-old age groups. Moreover, no significant differences in the ratio score and the total error score were observed among the 8-, 9-, 10-, 11- and 12-year-old age groups. This finding (the change in horizontal and vertical scores) is in accordance with the notion that a child's reading speed will plateau as he/she ages. Furthermore, our test involved the use of simple single-digit items; thus, it was rather easy for the older children to perform. Our results also showed that the Mandarin Chinese-speaking children were able to complete the vertical and horizontal DEM tests significantly more rapidly than both the English- and Spanish-speaking children in all age groups. This result suggests that the eye movement speed of the Chinese children with regard to the movement indices was significantly faster than that of their English and Spanish counterparts. The Chinese children obtained significantly lower vertical-horizontal ratio and total error scores than the English and Spanish children did. This finding may be the result of several factors. First, Chinese is an ideographic language, and numbers are easier to pronounce in Chinese than in English [22]. Therefore, the duration of each eye movement index differed between the two populations. The results of the DEM test are similar between phonetic languages, such as English and Spanish, as demonstrated by previous studies of Spanish and English children performed by Fernandez-Velazquez and Fernandez-Fidalgo [20] and Jiménez et al. [23]. However, differences among phonetic languages were found in Baptista et al.'s study [24], and the authors considered that different languages, educational systems and cultures may explain these differences. At this time, we cannot determine exactly what factor(s) might account for the differences in DEM scores. Second, the time at which reading training is initiated differs between Western countries and China. In China, children often begin reading training earlier than their American counterparts [25]. Chinese parents pay attention to the academic development of their children during kindergarten [25]. The vertical and horizontal scores of the children in our study differed significantly from those of the Cantonese-speaking children in the 6-, 7-, 8-, and 10-year-old age groups, which might be related to educational and economic level differences between the two samples. Jiménez et al. [23] found no differences in the DEM test results between Spanish-speaking children and the original group of English-speaking children described by Garzia and colleagues [18]. Taken together, these findings suggest that differences in language, cultural background, and educational systems may affect DEM scores.
In the current study, we excluded subjects with neurological disease, physical disabilities and mental retardation and those who could not complete the DEM test because of an inability to stay on task even after redirection. The experienced examiners were able to detect reading errors easily [26]. The manner in which the DEM test is administered might also contribute to its accuracy. The DEM test has been shown to have excellent test-retest reliability when it is used to evaluate patients undergoing vision therapy evaluations [27]. The present study likewise indicated good test-retest reliability of the DEM test. The internal consistency was investigated by examining the correlation coefficients among the three eye movement indices. Only DEM tests with sufficient reliability will yield accurate measurements.
In addition, the time reliability test and tester reliability test were conducted separately. For the time reliability test, the correlation coefficients between the two tests performed at different times were 0.84 for the vertical score, 0.90 for the horizontal score, 0.92 for the vertical-horizontal ratio, and 0.91 for the total error score. Accurate assessments of reading errors were performed by two experienced examiners. To assess tester reliability, we switched the two testers, who then re-measured a random sample of 180 children. The correlation coefficients of the two sets of tests were 0.99 for each of the vertical score, the horizontal score, the vertical-horizontal ratio, and the total error score. These results indicated the high reliability of our localized DEM test. Our results are consistent with those of Garzia et al. [16] and Jiménez et al. [23], who reported reference values that were relatively similar to those of the US population, with the exception of the 5-year-old age group in the Spanish study.
Our analysis of the DEM test allowed us to establish DEM test norms for children in Nanjing (aged 5 to 12 years). Several considerations are relevant to the wider use of these norms. First, children in China often begin reading training earlier than their American counterparts [22], and Chinese parents emphasize academic development at an early age. By the time children enter kindergarten in China, they have already learned how to read Arabic numerals. Second, Nanjing is a moderately developed city. The city participates in scientific research, including research related to childhood obesity in China. Third, the subjects who participated in this study had already passed a DEM pretest. Moreover, all of the participants were recruited from the main urban and suburban areas of Nanjing; thus, we consider the selected participants to reflect the Chinese population. Accordingly, the norms of the DEM test will be proposed for and used to evaluate Mandarin Chinese-speaking children throughout China. A considerable number of statistically significant differences between the Mandarin Chinese population study group and the other groups were observed. Finally, the DEM test norms for Mandarin Chinese-speaking Chinese children showed good reliability and validity and could be used to assess their clinical visual-verbal ocular motor functions.
Supporting Information S1 File. The DEM norms table for Cantonese-speaking children. The DEM norms table for Cantonese-speaking children from age 5 years to age 5 years, 11 months (