Bilingualism and Performance on Two Widely Used Developmental Neuropsychological Test Batteries

The present study investigated the effect of bilingualism on the two widely used developmental neuropsychological test batteries Wechsler Intelligence Scale for Children – Fourth Edition (WISC-IV) and A Developmental Neuropsychological Assessment, Second Edition (NEPSY-II) in children. The sample consisted of 100 Finland-Swedish children in two age groups. About half (n = 52) of the participants were early simultaneous bilinguals, and the other half (n = 48) were monolinguals. As no Finland-Swedish versions of the tests are available at the moment, both tests were translated and adapted to suit this population. The results revealed no difference in the performance between bilingual and monolingual children. This speaks against a cognitive advantage in bilingual children and indicates that development of separate norms for monolingual and bilingual children is not needed for clinical use.


Introduction
When investigating cognitive development in children, researchers have found an advantage in bilinguals concerning certain cognitive abilities, especially nonverbal executive control and theory of mind [1][2][3]. The reason for a possible advantage is thought to stem from the fact that managing two languages requires executive resources for selecting the relevant language and inhibiting the language not in use at the moment [4][5]. Some studies argue that the advantage for bilinguals arises when the task has higher executive demands [2][6][7], and when the bilinguals are balanced in their two languages [8][9]. There is, however, a growing body of studies indicating that the results regarding a bilingual advantage are inconsistent and that there has been a clear publication bias towards studies showing significant results [10][11]. Some also argue that the bilingual advantage can be explained by other factors, such as socioeconomic status and small sample sizes [12][13]. Recent large-scale studies with well-matched groups have not found support for the hypothesis of a bilingual advantage [14][15][16][17]. Paap, Johnson, and Sawi [13] compared individuals with different levels of bilingualism, but the bilingual groups were not better than the monolingual group on any of the executive function tasks. In verbal tasks, however, monolinguals have been shown to perform better than bilinguals [4][18][19][20]. This may partially be due to the fact that monolinguals possess a larger vocabulary in the language that is investigated than bilinguals do [21]. Statistically controlling for this group difference in verbal performance may bring up differences in executive performance that are more apparent than real (see [22]).
Despite the earlier literature suggesting an executive advantage in bilinguals, few studies have investigated the clinical relevance of bilingualism to neuropsychological testing. Among the most frequently used neuropsychological tests for children are the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV [23]) and A Developmental Neuropsychological Assessment, Second Edition (NEPSY-II [24]). Studies using WISC or NEPSY to assess differences in performance between bilingual and monolingual children have only administered some of the subtests, and the results have not been fully in line with the general findings regarding bilingualism and cognitive ability. Bialystok and Majumder [8] used the subtest Block Design from WISC-R [25] to measure 7- to 9-year-old monolingual and bilingual children's ability to perceive and analyze patterns, and found that bilinguals received higher scores than monolinguals in the task. Lauchlan, Parisi, and Fadda [26] administered four WISC subtests [27][28] (Block Design, Digit Span, Vocabulary, and Arithmetic) to mono- and bilingual children (mean age range for the different groups: 9 years and 1 month to 10 years and 4 months), and found an advantage for bilingual children in Block Design and, quite surprisingly, in Vocabulary. No significant group differences emerged for the Digit Span or Arithmetic subtests. Korkman et al. [29] used the NEPSY [30] to investigate 5- to 7-year-old bilingual and monolingual Finland-Swedish children's verbal capacity. Of the subtests administered (Body Part Naming, Speeded Naming, Comprehension of Instructions, Repetition of Nonsense Words, Narrative Memory, and Sentence Repetition), the bilingual children scored significantly lower only on Body Part Naming. Garratt and Kelly [31] compared the performance of 6- to 7-year-old monolingual and bilingual children, unbalanced in their two languages, on the 14 core subtests of NEPSY [32]. The bilinguals scored significantly lower than the monolinguals in the two verbal subtests Speeded Naming and Comprehension of Instructions. The bilinguals also scored significantly lower than monolinguals in Visual Attention. In contrast, the bilinguals outperformed the monolinguals on Imitating Hand Positions and Design Copying. No significant differences in performance between the groups were found on the remaining NEPSY core subtests (Tower, Auditory Attention, Phonological Processing, Finger Tapping, Visuomotor Precision, Arrows, Memory for Faces, Memory for Names, Narrative Memory).
Given the variable results from the small number of available studies reviewed above, it is evident that more research is needed to study the possible effects of bilingualism on widely used developmental cognitive test measures. Firstly, this is relevant for the clinical use of these test instruments, where the test results of a child can have important diagnostic consequences. In borderline cases, a possible positive or negative effect of bilingual language background that the diagnostician is unaware of could tip the balance of a clinical decision in one direction or the other. Secondly, although standardized cognitive test batteries are not designed to tap functions specifically associated with bilingualism, they can provide evidence relevant to the issue of a possible bilingual executive advantage. Accordingly, we investigated how bilingualism affects the performance of two age groups of Finland-Swedish children on selected subtests of the two widely used neuropsychological tests Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) and A Developmental Neuropsychological Assessment, Second Edition (NEPSY-II). Finland-Swedish bilingual children provide a particularly interesting study group, as they grow up in an officially bilingual society and their culture is very close to that of their monolingual Finnish-speaking peers. Children have equal opportunities to use their preferred home language as a day care and school language. Likewise, books, newspapers and TV programs are accessible in both languages. Thus, potentially confounding effects, such as cultural and socioeconomic factors, often present between a language majority and minority, are minimized in the present group comparison.

Participants
The sample consisted of 100 Swedish-speaking children recruited through public schools in Finland. The children represented two separate age groups: 50 children were 7 years and 1 month to 7 years and 6 months old, while 50 children were 10 years and 10 months to 11 years and 2 months old. In the younger sample, 25 (50%) children were bilingual and 25 (50%) monolingual, while 27 (54%) were bilingual and 23 (46%) monolingual in the older sample. All of the monolinguals were Swedish speakers. Of the bilingual children, 48 were reported by their parents to have acquired both languages by the age of three, and all bilingual children had learned their second language by the age of six. One child in the younger age group, who was reported to speak three languages fluently, was excluded from the analyses. Information regarding the language abilities of the bilinguals is presented in Table 1.
In order to ensure geographical representativeness, stratified sampling was used. The Swedish-speaking area of Finland was divided into three main geographical regions: the Helsinki Capital region, Ostrobothnia, and the remaining Swedish-speaking areas in southern Finland (encompassing coastal areas east and west of the Capital region). The Åland Islands were not included due to the similarity of their language culture with that of Sweden. Information from the reporting portal of the Finnish National Board of Education (Vipunen [33]) was used to determine the number of children needed from each region.
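The allocation of children to regions can be illustrated with a minimal sketch. It assumes proportional allocation with largest-remainder rounding, which the text does not state explicitly; the function name `allocate_proportionally` and the pupil counts are hypothetical and are not taken from the study or from Vipunen.

```python
def allocate_proportionally(stratum_sizes, total_sample):
    """Split total_sample across strata in proportion to stratum size.

    Largest-remainder rounding is used so that the quotas sum exactly
    to total_sample. This is an assumed procedure, not the authors' one.
    """
    population = sum(stratum_sizes.values())
    exact = {name: total_sample * size / population
             for name, size in stratum_sizes.items()}
    quotas = {name: int(share) for name, share in exact.items()}
    leftover = total_sample - sum(quotas.values())
    # Give the remaining places to the strata with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - quotas[name],
                          reverse=True)
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return quotas


# Hypothetical pupil counts per region (illustration only).
hypothetical_pupils = {
    "Capital region": 1550,
    "Ostrobothnia": 1400,
    "Other southern Finland": 1050,
}
quotas = allocate_proportionally(hypothetical_pupils, 100)
print(quotas)
```

With real enrollment figures in place of the hypothetical ones, the same function would give the number of children to recruit from each region.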
Subsequently, a random sampling of six schools from each region was carried out. Schools with fewer than 12 students in the 1st, 2nd, or 3rd grade in 2011 according to the Finnish National Board of Education [33] were excluded for practical reasons. Private schools, language immersion schools, and schools for children with special needs were also excluded. The directors of education in the municipalities of the resulting 18 schools were contacted and, after their approval, the headmasters of each school were contacted, and letters were sent out to the parents of all 1st, 4th, and 5th graders. The letter contained information about the study and two forms for the parents of the children to fill out, namely a background information form and an informed consent form. Due to non-responders and the need to exclude one school in Ostrobothnia that turned out to have a class size smaller than 12, reminder letters were sent out. Also, four of the schools in the Capital region declined participation. Thus, another school in the Capital region was randomly selected and contacted. The total number of letters sent out was 990, of which 359 (36.3%) were returned. Of these returned letters, 314 (87.5%) accepted participation in the study. Children outside the specific age range of the study were excluded, as well as children with diagnosed psychiatric or neurological disorders. Remedial instruction/education did not serve as an exclusion criterion. As more letters than needed were returned, the inclusion order was based on the randomization order of the schools and the date the letters were returned. See Table 2 for a description of the sample.
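The response figures reported above can be checked with a few lines of arithmetic. The counts are those given in the text; the overall consent rate relative to all letters sent is not reported in the text and is derived here only for illustration.

```python
letters_sent = 990
letters_returned = 359
consents = 314

# Percentage of sent letters that were returned.
return_rate = 100 * letters_returned / letters_sent
# Percentage of returned letters that accepted participation.
consent_rate = 100 * consents / letters_returned
# Derived figure (not reported in the text): consents relative to all letters sent.
overall_rate = 100 * consents / letters_sent

print(f"returned: {return_rate:.1f}%")
print(f"consented (of returned): {consent_rate:.1f}%")
print(f"consented (of sent): {overall_rate:.1f}%")
```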

Measures and procedure
The measures used in the present study were the Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) and A Developmental Neuropsychological Assessment, Second Edition (NEPSY-II). For WISC-IV, all subtests except the supplementary ones were administered. For NEPSY-II, a subtest was included in the battery if it contained a language component or had a Cronbach's alpha reliability value of more than .7 for the selected age groups.
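The NEPSY-II inclusion rule stated above can be written as a simple predicate. This is a sketch only: the `Subtest` class, the placeholder subtest names, and the alpha values are hypothetical and do not reproduce the reliabilities in the test manual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtest:
    name: str
    has_language_component: bool
    alpha: float  # Cronbach's alpha for the relevant age group (hypothetical here)


def is_included(subtest, threshold=0.7):
    """Include a subtest if it has a language component OR alpha exceeds the threshold."""
    return subtest.has_language_component or subtest.alpha > threshold


# Hypothetical candidates (placeholder names and values, for illustration only).
candidates = [
    Subtest("Subtest A", has_language_component=True, alpha=0.55),
    Subtest("Subtest B", has_language_component=False, alpha=0.82),
    Subtest("Subtest C", has_language_component=False, alpha=0.70),
]
selected = [s.name for s in candidates if is_included(s)]
print(selected)
```

Because the rule requires alpha to be more than .7, a nonverbal subtest with alpha exactly .70 is excluded in this sketch.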
The reliability values used were those reported in the Finnish version of NEPSY-II [16]. The selected subtests belonged to the domains Alertness and Planning, Linguistic Function, and Memory and Learning. All administered subtests and the cognitive components they are thought to measure are presented in Table 3. The data collection was carried out from January to April 2014 by five advanced psychology students trained in administering the measures. The children were tested individually in a single session, in a room provided by their own school. The session lasted approximately 3 hours (1.5 hours per test), and the test order (WISC-IV, NEPSY-II) was counterbalanced.

Table 3. The subtests used in the present analyses and the cognitive components they are thought to measure [23][24]. Columns: Subtest; Cognitive component.

Translation of the measures. As neither of the tests used in the present study has been adapted for the Swedish-speaking population of Finland, the Finnish versions of both WISC-IV [23] and NEPSY-II [24] were translated into Finland-Swedish versions. As the language in Sweden differs from the Finland-Swedish one, the Swedish versions of the tests [34][35] could not be used. However, when appropriate, the Swedish versions of the tests were used as support in the translation process. Psychologists, teachers, researchers, and linguists were involved in the translation process, and the resulting test was piloted on a small number of children. For all subtests, the test instructions and verbal scoring criteria were translated from Finnish into Swedish. With regard to WISC-IV Similarities, the direct translations for two of the items turned out to have multiple meanings in Swedish. Therefore, these items were changed to other words that were considered equivalent. In the same vein, seven items in the Vocabulary subtest were altered for linguistic and cultural reasons. Concerning NEPSY-II, the Swedish version of the accompanying CD in NEPSY [30] was used for the subtests Auditory attention and response set A and B. Also for Phonological processing, the Swedish version of the subtest was used, and in the phonological part of Word generation, the letters from the Swedish version were presented to the children. In Memory for names, the cards were taken from the Finnish version of NEPSY-II, while the names were taken from the Swedish version. For Narrative memory, the story was translated into Swedish from the Finnish version of NEPSY-II, but some words, such as names, were taken from the Swedish version of NEPSY-II.

Ethics Statement
The study was approved by the Institutional Review Board of the Department of Psychology and Logopedics at Åbo Akademi University. The parents of the children signed a written informed consent prior to participation.

Missing values
Due to administration errors (some of the subtests were terminated before the criterion number of errors was reached), scores for 33% of the children in Vocabulary, 18% in Letter-Number Sequencing, 9% in Comprehension, and 25% in Word List Interference were not available for the statistical analysis and were replaced with imputed values. The scores were assumed to be missing at random, and multiple imputation was therefore used to handle the missing data. In the imputation process, several versions of the dataset were generated with the fully conditional specification, an iterative Markov chain Monte Carlo (MCMC) method in the IBM SPSS Statistics (version 21) Multiple Imputation module. Graphical diagnostics suggested that 400 iterations were sufficient to reach convergence, so a dataset was saved every 400th iteration until 100 filled-in copies of the dataset had been computed. To avoid generating outlying values, imputed values were constrained to stay within the ranges specified by the manuals for the tests. The raw scores from all the administered subtests in both WISC-IV and NEPSY-II were used as predictors. The subsequent analyses on imputed variables were performed separately on each complete dataset, and Rubin's [36] rules were used to summarize the parameter estimates and their standard errors into a single set of results. These pooled statistics are reported whenever any of the variables containing imputed values are included in the analyses.
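The pooling step can be illustrated with a minimal sketch of Rubin's rules for a single parameter. The analyses in the study were run in SPSS; the function `pool_rubin` and the input values below are hypothetical, and the degrees-of-freedom adjustment used for inference is omitted.

```python
from math import sqrt
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need matching estimates and standard errors from >= 2 datasets")
    pooled_estimate = mean(estimates)
    within = mean([se ** 2 for se in std_errors])  # average sampling variance
    between = variance(estimates)                  # variance across imputations
    total = within + (1 + 1 / m) * between         # total variance
    return pooled_estimate, sqrt(total)


# Hypothetical regression coefficients from three imputed datasets.
estimate, se = pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
print(estimate, round(se, 4))
```

The between-imputation term is what makes the pooled standard error larger than the average within-dataset standard error, reflecting the uncertainty introduced by the missing scores.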

Results
In order to ensure that the two language groups were balanced on gender, remedial teaching, and parent educational level, two-sided Fisher's exact tests were conducted separately for each age group. In these analyses, no differences with regard to gender, remedial teaching, or the educational levels of the mothers were found between the monolingual and bilingual children in either age group. However, a difference in the educational levels of the fathers between the language groups was found in the younger age group (p = .003). The fathers of the bilingual children had a higher level of education than the fathers of the monolingual children. Only the statistics of significant results involving the language groups will be reported here.
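For a 2 x 2 table, such as language group by gender, a two-sided Fisher's exact test can be sketched as follows. The function name and the cell counts are hypothetical, and variables with more than two categories (such as educational level) would require the r x c generalization, which is not shown here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so that ties with the observed table are counted.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Hypothetical counts (illustration only): rows = language group, columns = gender.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```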
The average raw scores of the monolinguals and bilinguals on each administered subtest are presented separately for each age group in Tables 4 and 5. To investigate the effect of bilingualism on test performance, ANOVAs were conducted. In these analyses, the raw score of each subtest, or the standardized scores of each index, served as dependent variables. The independent variables were language group (monolingual; bilingual) and age group (younger; older). For imputed datasets, IBM SPSS Statistics (version 21) does not provide pooled F-statistics for ANOVAs. Therefore, linear regression analyses were conducted when the analyses included variables containing imputed values. The dependent variables in these analyses were the raw score of the subtests or the standardized scores of the indexes, while language group (monolingual; bilingual) and age group (younger; older) were used as predictors.
The ANOVA on Symbol Search revealed a significant main effect of language group, F(1, 95) = 9.30, p = .003, partial η² = .089, and a significant interaction between language and age, F(1, 95) = 5.85, p = .017, partial η² = .058. The results indicated that in the younger age group, the monolinguals received higher scores than the bilinguals. No other statistically significant effects of language group were found in any of the analyses.
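The reported effect sizes can be recovered from the F statistics and their degrees of freedom, since partial eta squared equals F x df_effect / (F x df_effect + df_error). The sketch below uses the values reported above; the function name is an illustrative choice.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared computed from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Values reported for Symbol Search.
main_effect = partial_eta_squared(9.30, 1, 95)
interaction = partial_eta_squared(5.85, 1, 95)
print(round(main_effect, 3), round(interaction, 3))
```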

Discussion
The present study investigated whether bilingualism affects school-aged children's performance on selected subtests of the two widely used neuropsychological tests Wechsler Intelligence Scale for Children-Fourth Edition (WISC-IV) and A Developmental Neuropsychological Assessment, Second Edition (NEPSY-II). The sample consisted of 100 Finland-Swedish children in two age groups, 7-year-olds and 10- to 11-year-olds. Overall, our results indicated that a bilingual vs. monolingual language background does not affect children's performance on these cognitive test batteries. The only statistically significant difference was that the monolinguals scored higher than the bilinguals on the subtest Symbol Search in WISC-IV. This difference was found only in the younger age group.
Previous research on WISC has found bilinguals to perform better than monolinguals on Block Design and Vocabulary [8][26]. However, Bialystok and Majumder [8] compared an English-speaking monolingual group to a French-English bilingual and a Bengali-English bilingual group. The school language was French in one of the groups and English in the two others, and the tests were administered in English. Also, Bialystok and Majumder [8] did not take socioeconomic status into account. Lauchlan et al. [26] compared bilingual children from Scotland and Sardinia to monolingual English- and Italian-speaking children. The difference in vocabulary favoring the bilingual children was found only for the Scottish group and not for the Sardinian children. Therefore, the group differences reported by Lauchlan et al. [26] and Bialystok and Majumder [8] may stem from factors other than bilingualism.
Regarding previous studies comparing bilinguals and monolinguals on NEPSY, the present results are in line with the Finnish study by Korkman et al. [29], where no difference in performance was found between monolinguals and bilinguals on the subtests Speeded Naming, Comprehension of Instructions, and Narrative Memory of the NEPSY-II test battery. Of the subtests administered in the study by Garratt and Kelly [31], the present study administered Auditory Attention, Phonological Processing, Speeded Naming, Comprehension of Instructions, Memory for Names, and Narrative Memory. In contrast to the results from the present study and the study by Korkman et al. [29], the performance of the bilingual children in the Garratt and Kelly [31] study was weaker than that of the monolinguals on the two verbal subtests Speeded Naming and Comprehension of Instructions.
The present study did not find significant performance differences in favor of the bilingual group and therefore does not support previous studies showing a bilingual advantage in children. It should also be noted that the results from previous research have been inconsistent regarding a bilingual advantage [1][2][11][13]. De Bruin, Treccani, and Della Sala [10] suggest that there has been a clear publication bias towards studies showing significant results. Some also argue that the bilingual advantage can be explained by other factors, such as socioeconomic status and small sample sizes [12][13]. The question of whether there is a bilingual advantage has thus not yet been answered, despite the great number of studies investigating the issue (see, e.g., [1][2][11][13]).
The bilingual advantage is typically seen in tasks measuring executive functions, such as inhibition of irrelevant information and working memory [1][2]. The test batteries employed here mainly include subtests that are not primarily executive (although performance in all subtests naturally enough requires some executive processes). The ones with the greatest demands on executive functioning are Auditory Attention A (attention) and B (set shifting, inhibition, and working memory), Word Generation (attention, working memory), Word List Interference (working memory), Digit Span (working memory) and Letter-Number Sequencing (working memory), but no group differences were seen on these subtests either.
In verbal tasks, monolinguals have often been shown to perform better than bilinguals [4][18][19][20]. In the present study, the bilinguals had learned both languages before the age of 7, and the tests were administered in the same language as they used in school, which may explain why no differences in any verbal subtests between the language groups were found.
The single significant result, an advantage for monolinguals on the subtest Symbol Search, which taps perceptual speed, has not been reported previously and escapes a clear theoretical explanation in the present context. It may thus represent a chance finding, as it was the only significant group difference, was observable in one age group but not in the other, and is difficult to explain in the light of previous research on bilingualism and cognitive function.
The effects of bilingualism on cognitive performance remain a hot research topic. In sum, the results from the present study show no difference between monolingual and bilingual children on selected subtests of WISC-IV and NEPSY-II. The main implication of our study for clinical psychological practice is that for diagnostic use, there is no need for separate norms for monolingual and bilingual children.