The impact of bilingualism on executive functions and working memory in young adults

  • Eneko Antón ,

    Roles Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing

    Affiliations Facultad de Lenguas y Educación, Universidad Nebrija; Madrid, Spain, BCBL. Basque Center on Cognition, Brain and Language; Donostia, Spain

  • Manuel Carreiras,

    Roles Conceptualization, Funding acquisition, Investigation, Supervision, Validation, Writing – review & editing

    Affiliations BCBL. Basque Center on Cognition, Brain and Language; Donostia, Spain, Ikerbasque, Basque Foundation for Science; Bilbao, Spain, Euskal Herriko Unibertsitatea–Universidad del País Vasco; Bilbao, Spain

  • Jon Andoni Duñabeitia

    Roles Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing – review & editing

    Affiliations Facultad de Lenguas y Educación, Universidad Nebrija; Madrid, Spain, BCBL. Basque Center on Cognition, Brain and Language; Donostia, Spain


A bilingual advantage, in the form of better performance by bilinguals in tasks tapping into executive function abilities, has been reported repeatedly in the literature. However, recent research argues that this advantage does not stem from bilingualism, but from uncontrolled factors or imperfectly matched samples. In this study we explored the potential impact of bilingualism on executive functioning abilities by testing large groups of young adult bilinguals and monolinguals in the tasks that were most extensively used when the advantages were reported. Importantly, the recently identified factors that could be disrupting the between-group comparisons were controlled for, and both groups were matched. We found no differences between groups in their performance. Additional bootstrapping analyses indicated that, when the bilingual advantage appeared, it very often co-occurred with unmatched socio-demographic factors. The evidence presented here indicates that the bilingual advantage might indeed be caused by spurious uncontrolled factors rather than bilingualism per se. Secondly, bilingualism has also been argued to potentially affect working memory. Therefore, we tested the same participants in both a forward and a backward version of a visual and an auditory working memory task. We found no differences between groups in either of the forward versions of the tasks, but bilinguals systematically outperformed monolinguals in the backward conditions. The results are analysed and interpreted taking into consideration different perspectives on the domain-specificity of the executive functions and working memory.


The core assumption of the bilingual advantage (BA) hypothesis [1] is that bilingualism provides enhanced executive function abilities as a consequence of the constant use of two languages. According to Miyake and Friedman’s model [2], the executive functions (EF) encompass inhibition (i.e., the ability to suppress dominant or salient responses), shifting (the capacity to switch between tasks), and monitoring (the ability to update the information in the working memory; see [2,3]). These components are necessary for the cognitive control of human behaviour, and although they have been shown to be interrelated to some extent, they are separable entities that contribute to behaviour in different manners [2]. Among these behaviours, language control is of special interest to us: as the two languages that a bilingual speaks are always active [4–6], bilinguals need to monitor and constantly update the demands of the context they are immersed in and the speakers they are talking to, switch to the target language and inhibit the non-target one (i.e., an efficient use of language control, see, for example, the IC model, [7]). As can be seen, language control makes use of EF mechanisms [8]. Although EF abilities have been shown to be strongly heritable [9], some studies have shown that they can be improved with training [10–12]. Importantly, as EF have been generally assumed to be domain-general (i.e., the same underlying mechanisms would be responsible for language control and any other kind of executive control, see [13–15]), it could be hypothesized that bilinguals’ intensive practice of language control could strengthen these control skills, and that the benefit would transfer to and be captured in any situation that requires the use of domain-general EF abilities.
For example, this could transfer to tasks such as the flanker [16], Simon [17] and Stroop [18] tasks, which give an indication of participants’ general inhibitory abilities (but see [19]) when reaction times (RTs) and errors to congruent and incongruent trials are compared. In this line, some bilingual samples have displayed better inhibitory abilities than monolinguals–i.e., smaller differences between congruent and incongruent trials–in the Stroop [20], Simon [1,21,22] and flanker [23] tasks, and this has been accounted for by the constant need to inhibit the non-target language (or stimulus) while managing two languages.
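The inhibition index described above, the difference between incongruent and congruent trials, can be sketched as follows. This is only an illustration: the tuple layout, the function name `congruency_effect` and the choice to exclude error trials from the RT means are assumptions, not details given in the text.

```python
from statistics import mean


def congruency_effect(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials.

    `trials` is a list of (condition, rt_ms, correct) tuples. Only correct
    responses enter the means (an assumption made for this sketch).
    """
    congruent = [rt for cond, rt, ok in trials if cond == "congruent" and ok]
    incongruent = [rt for cond, rt, ok in trials if cond == "incongruent" and ok]
    return mean(incongruent) - mean(congruent)


# Made-up trials for one participant.
example = [
    ("congruent", 480, True),
    ("congruent", 520, True),
    ("incongruent", 560, True),
    ("incongruent", 600, True),
    ("incongruent", 900, False),  # error trial, left out of the RT means
    ("neutral", 500, True),
]
effect = congruency_effect(example)
print(effect)
```

A smaller value means less interference from the conflicting information, which is what a bilingual advantage in inhibition would predict.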

The bilingual advantage, however, is not only one of the most popular research topics in the field of bilingualism nowadays, but also one of the most controversial ones as there is a growing debate regarding its existence and origin [24,25]. For example, Costa et al. [23] found that bilinguals were overall faster than monolinguals in the flanker task (see also [26]), but both in congruent and in incongruent trials. Better inhibitory abilities would only be expected to improve responses to incongruent trials, they argued [27], so bilinguals’ overall faster performance was linked to the bilinguals’ better monitoring abilities, stemming from their constant need for overseeing the linguistic demands of the current environment in order to be able to choose the appropriate language. Indeed, they only found faster overall reaction times in high-monitoring versions of the task with a demanding enough environment ([27], see [1,26], for similar conclusions; but see [19], for a discussion on the impurity of the use of global RTs as a measure of monitoring). These mixed results prevent researchers from drawing strong conclusions as to which EF component is enhanced by bilingualism, inhibition or monitoring.

Some authors have raised concerns regarding the results and interpretation of the bilingual advantage [24,28–31], arguing that the evidence favouring the BA is actually a consequence of uncontrolled external factors, small sample sizes and task-dependent effects. Among the external factors, socio-economic status (SES), immigrant status and ethnic background have been shown to affect EF abilities [32–34], and these factors tend to differ between bilingual and monolingual individuals in certain populations. For example, immigrants–who tend to be bilinguals–show better morbidity and mortality outcomes than non-immigrants around the world [35–40], which is known as “the healthy immigrant” effect, and also display a higher educational profile or IQ level [41–43] than non-immigrants. Those factors could potentially cause differences between groups in EF, and there would be no way of disentangling the potential effects of bilingualism from those produced by the uncontrolled factors. A review of the existing literature showing a bilingual advantage suggests that the abovementioned concern is the rule rather than the exception for the majority of the studies. We observe studies in which SES was not controlled for [14,44], in which comparisons are made between monolinguals and bilinguals that were tested in different countries [45], or in which the bilingual sample tested included a majority of immigrants [20]. Furthermore, when large samples of participants are matched in the confounding variables [24], the bilingual advantage systematically vanishes, with comparable performance of monolingual and bilingual children [46–48], young adults [30] and older adults [49–52]. Very recently, Lehtonen et al. [53] explored in detail 891 effect sizes from 152 different studies, both published and unpublished, that compared bilinguals’ and monolinguals’ performance in six different EF tasks.
They found very little evidence supporting the bilingual advantage theory, and that evidence disappeared when the observed publication bias was corrected for. In addition, and somewhat confirming the low reliability and replicability of the bilingual advantage effect, significant findings on the bilingual advantage occur principally when sample sizes are small (around n<30, see [31]). Furthermore, these effects are not always found across the tasks that are assumed to measure the same construct of executive control [26]. As Paap and his colleagues argue, for the hypothesis of the bilingual advantage to be coherently demonstrated, the advantage should be present in at least two different tasks that tap into the same cognitive ability, and the markers of those tasks should correlate, which seems not to be the case (see, for example, [30]). This lack of convergent validity calls into question the domain-generality of the EF and, as a consequence, the transfer from language control training to the enhanced EF supposedly provided by bilingualism.
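The convergent validity argument rests on correlating the markers obtained in different tasks. A minimal sketch of such a check is given below, assuming one interference score per participant and per task; the variable names and the values are invented for illustration, and a plain Pearson correlation is used here as one possible choice of coefficient.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical interference effects (ms) of five participants in two tasks.
flanker_effect = [40, 55, 60, 35, 70]
simon_effect = [30, 20, 45, 25, 50]
print(round(pearson_r(flanker_effect, simon_effect), 3))
```

If two tasks measured the same inhibitory ability, their indices would be expected to correlate positively across participants; a correlation near zero would speak against a shared construct.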

Hence, the first aim of the present set of tasks is to test the reliability and replicability of the bilingual advantage in EF. Similarly to what earlier studies of our group did with children [46–48] and the elderly [49], in the current study we tested large adult samples of bilinguals from a bilingual community and monolinguals from a monolingual community of the same country in the classic aforementioned tasks, while controlling for potentially confounding factors. To further check for the influence of external factors, bootstrapping and regression analyses were also conducted, given that it has been proposed that the effects suggesting a bilingual advantage may emerge when samples are small and external sociodemographic factors are not controlled for [24,30]. A bootstrapping analysis of different sample sizes would provide information regarding how often a significant differential effect between bilinguals and monolinguals emerges with small samples, and how often this advantage co-occurs with unmatched sociodemographic factors. Regression analysis would provide further information on how much each of the external factors contributes to the final performance. Also, the assumption of the domain-generality of the EF will be tested by running correlation analyses between the indices obtained in different tasks [24].
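The logic of the bootstrapping analysis can be illustrated with a short sketch. It is not the procedure used in the study: resampling with replacement, the normal approximation to the p-value, the dictionary keys `effect` and `ses`, and all function names are assumptions made to keep the example small and free of external libraries.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, variance


def two_sided_p(a, b):
    """Approximate two-sided p-value for a difference in means.

    Uses Welch's standard error with a normal approximation (an assumption
    of this sketch; a t distribution is the usual choice for small samples).
    """
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 1.0  # degenerate case with no variability, treated conservatively
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def bootstrap_cooccurrence(bilinguals, monolinguals, n, iterations=1000,
                           alpha=0.05, seed=0):
    """Count subsamples showing a significant bilingual advantage, and how
    many of those also show a significant SES difference between groups."""
    rng = random.Random(seed)
    advantages = cooccurrences = 0
    for _ in range(iterations):
        bi = rng.choices(bilinguals, k=n)      # resample with replacement
        mono = rng.choices(monolinguals, k=n)
        bi_eff = [p["effect"] for p in bi]
        mono_eff = [p["effect"] for p in mono]
        # "Advantage" here means a significantly smaller interference effect.
        if mean(bi_eff) < mean(mono_eff) and two_sided_p(bi_eff, mono_eff) < alpha:
            advantages += 1
            ses_p = two_sided_p([p["ses"] for p in bi], [p["ses"] for p in mono])
            if ses_p < alpha:
                cooccurrences += 1
    return advantages, cooccurrences


# Synthetic participants drawn from the same distributions in both groups.
_rng = random.Random(1)


def _fake_group(size):
    return [{"effect": _rng.gauss(50, 20), "ses": _rng.gauss(800, 250)}
            for _ in range(size)]


bilinguals, monolinguals = _fake_group(90), _fake_group(90)
advantages, cooccurrences = bootstrap_cooccurrence(bilinguals, monolinguals, n=20)
print(advantages, cooccurrences)
```

Running the function for several values of `n` would show how the rate of spurious "advantages" changes with sample size, which is the question the bootstrapping analysis is meant to address.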

Interestingly, the bilingual advantage has been recently associated with an enhancement of working memory (WM) abilities as well, understood as the cognitive capacity to temporarily hold information for processing [54]. Many studies have explored WM abilities and different models have been proposed, but Baddeley and Hitch’s model [55] is probably the most widely accepted one. These authors propose a multicomponent model, with a central executive system that directs attention to, manipulates, and controls and integrates the flow of information from and to the slave systems, the phonological loop (which stores verbal content) and the visuo-spatial sketchpad (responsible for visuo-spatial information). Those two stores appear to be independent, so even if one is being used the other would still be free. Similarly to what occurs with EF, it has been shown that WM can also be improved by training (see, among many others, [56–59]; but see also [60] for evidence against the beneficial effects of training in WM). Therefore, as managing languages that are constantly competing for selection requires the use of WM resources [61], it is logical to assume that this training would improve bilinguals’ WM [62]. Some authors argue that the reported bilingual advantage effects could actually be reflecting improved WM abilities rather than better EF skills [63]. However, the relation between bilingualism and WM is a hard domain to explore in isolation, because the concepts of EF and WM are closely connected [54] and there is a strong positive relation between both capacities [64]. Indeed, Baddeley’s central executive system [55] clearly resembles the definition and the functionalities of the EF as described by Miyake and Friedman [2].
Thus, different aspects of EF operate with the information held in the WM, and differences in EF abilities have been argued to correlate with differences in WM capacities [65], especially in complex memory tasks with high demands of storing and processing of information [66]. Furthermore, it is important to note that the definitions traditionally given to WM [64] and updating/monitoring [3] overlap in that they both define the ability to manipulate information in the primary memory, and they have often been equated [67], although convergent validity analyses have recently shown that they are clearly related but separable components [68]. As it is possible that what has been observed as a bilingual advantage in EF might have been a reflection of an advantage in a mediating overlapping factor–WM–, “any finding of an advantage in controlled attention would be far more convincing if WM were held constant” [63]. In children, several studies have found a bilingual advantage in EF while keeping WM constant (see [26], but see [69] for similar findings obtained in samples coming from clearly different linguistic backgrounds, indicating different immigrant statuses).

Given that the abovementioned classic EF tasks also require WM capacities, inasmuch as a rule has to be kept in mind to adequately respond to the task, the outcomes would be a product of both EF and WM abilities. Thus, we will explore the potential impact of bilingualism on WM abilities in isolation, by using the tasks tapping into them alone, where no inhibitory abilities or switching are needed. The studies conducted so far using non-linguistic memory tasks show inconsistent and unclear evidence: Bialystok et al. [1] tested young and old monolinguals and bilinguals in an easy (with two colour cues and response possibilities) and a hard (four different cues and responses) version of the Simon task, and found that the increase in difficulty (i.e., increase in the WM load) was handled better by the bilinguals than by the monolinguals (but see [70] for no differences after controlling for socio-demographic factors). Bialystok, Craik, and Luk [20] found no differences between bilinguals and monolinguals in the Self-ordered pointing tasks [71] and minor differences in the Corsi task [72,73], where bilinguals recalled more items than monolinguals, with no differences between forward or backward repetition conditions. Luo, Craik, Moreno and Bialystok [74] tested younger and older monolingual and bilingual adults in forward and backward verbal and spatial (Corsi task) WM tasks, and bilinguals outperformed monolinguals on the spatial tasks but monolinguals did better on verbal tasks. Later, Ratiu and Azuma [75] tested 52 bilinguals and 53 monolinguals in a variety of simple and complex WM tasks, including a backward digit-span task, standard operation span task and a non-verbal symmetry task. 
Although bilinguals had a significantly higher educational level, which could arguably facilitate the appearance of a bilingual advantage, analyses indicated that monolinguals showed a significantly higher score than bilinguals in operation span and backward digit tasks, with no differences in the symmetry task. However, in a multivariate regression analysis, speaker group (bilinguals vs. monolinguals) did not predict scoring in any of the tasks, nor did it interact with any other factor. Interestingly, we observe that some of the concerns with respect to the influence of external factors [24,30,31] also apply to these studies. Namely, the bilinguals tested by Bialystok, Craik, and Luk [20] spoke a huge variety of second languages, indicating different cultural (and probably ethnic) backgrounds, and the SES was not measured. Crucially, out of 24 participants per group, 14 young bilinguals and 20 old bilinguals were immigrants. In the study by Luo, Craik, Moreno and Bialystok [74], language groups differed in their English vocabulary level and their nonverbal intelligence scores. This was to some extent solved by including those factors in an ANCOVA showing that the effect remained the same, but their samples still feature uneven sizes (58 monolinguals and 99 bilinguals) and language backgrounds (the second language spoken by the young adults varied among a set of more than 10 languages), suggesting differences in ethnicity. Importantly, many other relevant variables such as immigrant status or SES were not reported.

Recently, trying to account for WM differences while controlling for external factors, Hansen and colleagues [76] tested 152 native Spanish children, half of whom were attending a bilingual immersion schooling program (i.e., receiving teaching in both English and Spanish) and the other half a Spanish-only program. Both groups of children were matched in various demographic factors, including SES and intelligence, and they were all from the same city. They were tested in an n-back task (participants have to indicate if the current trial matches the one from n trials back) and a reading span task (a task that requires remembering the last word of each sentence in a presented set), as well as a rapid automatic naming task to measure participants’ verbal processing speed. They found that young bilinguals performed better than monolinguals in the n-back task, a task that heavily relies on updating (monitoring and updating the items held in the WM) and inhibition (the relevant item changes in every trial). On the other hand, young bilinguals showed a disadvantage in the reading span task (which requires mostly linguistic processing and verbal storage), whereas older groups showed an advantage (see also [77], for an extended exploration on the effects of emergent bilingualism on literacy). Hansen and colleagues argue for an effect of bilingualism on WM during the first years of immersion due to the higher demands that children might face when they encounter bilingualism for the first time. However, this difference would just modulate the development of said skills, and importantly the groups would eventually equalize (in the same regard, see also [62]). In contrast, taking a similar longitudinal perspective to explore the effect of bilingualism on WM during adulthood, Ljungberg and colleagues [78] found that bilinguals ranging from 35 to 80 years of age systematically outperformed monolinguals in several tasks tapping into episodic memory and in letter fluency tests.

As can be seen, few studies have specifically looked at the effects of bilingualism on WM. The pattern of results ranges from similar performance of bilinguals and monolinguals [26,79] to a benefit of bilingualism [20,62,78], or even a bilingual disadvantage [75]. Considering the opposing pieces of evidence presented so far and the lack of control of the relevant factors [24,30,31] that some studies feature, consistent conclusions cannot be drawn from those data. As the second aim of the present study, we will test the effects of bilingualism on WM with similar samples of bilinguals and monolinguals that only differ in linguistic background. The tasks used here will be the Corsi task, used to capture the visual span capacity (a putative index of the visual sketchpad component of the WM system, see [80]) and the digit span task, as an indicator of the phonological loop component of the WM [81].

The current experiment aims at testing the effects of bilingualism on both EF (first set of four tasks) and WM (second set of four tasks) with large cohorts of carefully matched monolinguals and bilinguals. Ninety young bilingual adults from the Basque Country (a region in the north of Spain where Basque and Spanish are co-official) and 90 carefully matched monolinguals from Murcia (a south-eastern region of Spain where only Spanish is spoken and official) were tested. To explore the assumption of the bilingual advantage theory that bilingualism enhances general EF that apply to any domain-general situation, the degree of cross-task replicability will also be tested [30]. To account for the possible impact of bilingualism on WM, the forward and backward versions of a spatial (Corsi task) and a numerical (digit span task) memory task will be used. Thus, while it is impossible to get rid of any possible WM influence in the set of EF tasks, the use of more specific tasks will provide us with information about WM in isolation. To assess the co-occurrence of a bilingual advantage and significant socio-demographic differences, additional bootstrapping and regression analyses will also be conducted.

Materials and methods


A total of 180 young adults from Spain took part in this series of tasks. The 90 bilinguals (68 females, mean age 22.29 years, SD = 2.87) were tested in the facilities of the BCBL, in Donostia-San Sebastián (in the Basque Country). On average they had acquired Basque at 0.96 years of age (SD = 1.27) and they reported a general proficiency of 8.41 out of 10 (SD = 1.88) in Basque. Self-reports of Spanish proficiency reached a mean score of 8.58 (SD = 1.91), and this language was acquired at an average age of 1.13 years (SD = 1.72). Thus, bilinguals were balanced in terms of proficiency (p>.33) and age of acquisition (p>.42). They were also interviewed by a native bilingual research assistant to make sure that they were truly balanced and native bilinguals. The interviewer gave a score to each participant based on their fluency, correctness and conversation abilities, from 1 (very poor/no knowledge of Basque) to 5 (no mistakes, native level). Every bilingual participant obtained a 5 in their Basque interview (for further details on the interview process, see [82]). As opposed to heritage bilingual speakers, Basque bilinguals tend to show a more balanced dominance of and exposure to both languages, as they are fully immersed in a bilingual society. Generally, Spanish is more present in the media and leisure activities, but the bilinguals from our sample reported being equally exposed to Spanish (47.78% of their time, SD = 17.07) and to Basque (43% of their time, SD = 17.95), with no statistical differences (p>.18). We found that they write more in Basque (52.67% of their time, SD = 24.30) than in Spanish (39%, SD = 23.80, p < .01), indicating a predominant tendency for formal education in Basque. On average, they reported speaking Spanish 50% of the time (SD = 19.11) and Basque 43.22% of the time (SD = 20.21), a difference that, although marginally significant (p>.08), reflects a highly balanced use of both languages.

The 90 monolinguals (67 females, 21.84 years of age on average, SD = 3.05) were recruited in the region of Murcia, in the south-east of Spain, and tested at the University of Murcia. They reported having acquired Spanish at a mean age of 0.68 years (SD = .76) and a mean proficiency of 9.13 (SD = .84), with very little or no knowledge of any other language.

Participants from both groups were matched for a variety of factors that could potentially affect our experimental purposes. The matched 90 bilinguals and 90 monolinguals were selected from a bigger sample of 126 monolinguals and 141 bilinguals by testing their differences in age, IQ, socio-economic status (SES), educational level and knowledge of Spanish using independent-samples t-tests. An estimation of the IQ of each participant was based on their performance on an abridged version of the Kaufman Brief Intelligence Test (K-BIT, [83]) that was administered during the experimental session. As an indicator of the SES, total monthly household income was divided by the number of household members, thus obtaining an approximate value of the income that each member of the household receives monthly on average and making incomes more comparable across families of different sizes. The majority of the participants had already obtained a university degree (or higher) or were in the process of obtaining one, and this number was virtually identical across groups (88 bilinguals and 87 monolinguals). To control for their proficiency in Spanish (namely, the test language) every participant completed the Spanish version of the LexTale [84], which provides an objective indicator of their Spanish mastery. All these demographic and linguistic variables that could affect the outcomes of the study were thus matched across groups (all ps>.1; see Table 1 for detailed information about the participants). All participants reported normal or corrected-to-normal vision and signed a consent form according to the principles established by the ethics committee of the BCBL.
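The SES indicator described above is a simple per-capita computation, sketched below. The function name and the example figures are hypothetical, and no currency unit is implied.

```python
def per_capita_income(monthly_household_income, household_members):
    """Monthly household income divided by the number of household members."""
    if household_members < 1:
        raise ValueError("a household needs at least one member")
    return monthly_household_income / household_members


# Two made-up households with different sizes but the same per-capita value.
print(per_capita_income(3000, 4))
print(per_capita_income(1500, 2))
```

Dividing by household size is what makes a four-person and a two-person household comparable in this example, even though their total incomes differ by a factor of two.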

Table 1. Demographic factors of the participants.

Mean values are presented together with standard deviation (between parentheses) and the p value resulting from an independent groups t-test with an alpha value of 0.05.


Bilingual participants were tested in the facilities of the BCBL in Donostia-San Sebastián, and monolingual participants were tested at the University of Murcia, in Murcia. In both locations, participants went through the experimental session in a room with equivalent settings, with the same equipment used in both labs. The experiment was run using Experiment Builder (SR Research) v 1.10.1385, and the CRT monitor was set to 60 Hz at a resolution of 1280x1024 and placed at an approximate distance of 60 centimetres from the participants. Manual responses were recorded using a response box with 7 buttons, with the first one on the left painted red and the last one on the right painted green. Audio playback and voice recording were done using Sennheiser PC151 headsets.

Following four pseudo-randomized lists, which were the same for both bilinguals and monolinguals, participants performed four tasks aimed at measuring their EF and four other tasks aimed at gathering information about their WM skills. The tasks used to measure EF were the Flanker task [16], Simon task [17], the Verbal Stroop task [18], and the Numerical Stroop task [85]. The tasks used to measure WM were two versions (forward and backward) of the Corsi test [71,72] and the digit span test [86]. All the tasks were conducted in the same experimental session in one day, which lasted around 60 minutes. The duration was approximate and contingent on the participants’ performance (see below for details). The results of the tasks were analysed following the traditional alternative hypothesis testing with ANOVAs, as well as using Bayesian Null Hypothesis testing comparisons [87,88].


For the flanker task, rows of five arrows (←) were displayed on the centre of the screen. For the congruent condition, the central arrow was flanked by four arrows pointing in the same direction (← ← ← ← ←). For the incongruent condition, the central arrow was flanked by arrows pointing in the opposite direction (← ← → ← ←), and for the neutral condition, the arrow was flanked by no arrows (— — ← — —). There were 16 items of each condition, 8 of them with the central arrow pointing to the left and the other half with the central arrow pointing to the right.

In the Simon task participants were presented with a black circle or a black square on the screen, and they were instructed to respond with the red button (on the left of the response box) if they saw a circle or with the green button (on the right) if they saw a square, irrespective of its position on the screen. Thus, the incongruent condition was created by presenting circles on the right side of the screen or squares on the left side of the screen, making participants respond with the button on the side opposite to the one on which the figure appeared. The congruent condition was created by presenting the figures on the same side of the screen as the required response button, i.e., by presenting circles on the left and squares on the right. Finally, the neutral condition was created by presenting the figures in the middle of the screen. There were 16 items of each condition, and half of the items per condition were squares, while the other half were circles.

For the verbal Stroop task, the Spanish words for the colours red, blue and yellow (“rojo”, “azul” and “amarillo”) and three pairwise-matched (with a similar length, frequency and syllabic structure) non-colour words (“ropa”, “avión” and “apellido”, the Spanish words for clothes, plane and surname, respectively) were used as target items. They were arranged to create the congruent (a colour word printed in the same colour that the word indicates; e.g., the word “azul” in blue), incongruent (a colour word printed in a colour different from the one it names, e.g., the word “rojo” in blue) and neutral (non-colour words printed in any of the colours, e.g., the word “ropa” in red) conditions. In each condition 24 trials were used, and each colour was presented the same number of times in each condition (8 times), paired equally with every word. All the strings were presented in uppercase Courier New font on a black background, while the colours were set in RGB values as follows: blue = 0,0,255; red = 255,0,0; yellow = 255,255,0.

For the numerical Stroop task, stimuli consisted of six digits (2, 3, 4, 6, 7, and 8), arranged in pairs to form each trial (e.g., 2–6), with one digit presented on the left side of the screen and the other on the right side. Participants had to say which one was larger in physical size, ignoring the numerical value. Depending on how the digits were paired, three conditions were created: 24 congruent trials (the larger number in magnitude was also the bigger in size, e.g., small 2-big 6), 24 incongruent trials (the smaller number in size was the larger in magnitude, e.g., big 2-small 6) and 24 neutral trials (the same number presented twice in different sizes, e.g., big 4-small 4). In all the conditions “left” and “right” responses were equally distributed, and each digit was used in each condition an equal number of times.
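The three conditions of the numerical Stroop task follow from how numerical magnitude and physical size are combined, which can be sketched as a small classifier. The function names and the font-size values are hypothetical; the source does not report the actual sizes used.

```python
def classify_trial(left_digit, left_size, right_digit, right_size):
    """Label a digit pair as congruent, incongruent or neutral.

    Sizes are physical (font) sizes; the two digits must differ in size.
    """
    if left_size == right_size:
        raise ValueError("the two digits must differ in physical size")
    if left_digit == right_digit:
        return "neutral"  # same digit twice, only the size differs
    larger_value_on_left = left_digit > right_digit
    larger_size_on_left = left_size > right_size
    if larger_value_on_left == larger_size_on_left:
        return "congruent"
    return "incongruent"


def correct_response(left_size, right_size):
    """Side of the physically larger digit, which is what participants report."""
    return "left" if left_size > right_size else "right"


SMALL, BIG = 20, 40  # hypothetical font sizes
print(classify_trial(2, SMALL, 6, BIG))   # small 2, big 6
print(classify_trial(2, BIG, 6, SMALL))   # big 2, small 6
print(classify_trial(4, BIG, 4, SMALL))   # big 4, small 4
```

Note that the correct response depends only on physical size, so numerical magnitude is task-irrelevant information that either agrees or conflicts with it.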

Participants were instructed about the responses before each task, and received indications about the button press or vocal responses in each case. In the Simon, the Flanker, and the Numerical Stroop tasks, there was a short training phase before the experiment started. After a black fixation point was displayed in the centre of the screen for 1000ms on a white background, the stimulus was presented on the screen for 5000ms or until the response was given. The order of the stimuli was randomized and there were no breaks. In the verbal Stroop task, after a short training period, a fixation mark was presented for 250ms (a white cross on a black background), and then the target word appeared on the screen for 3000ms, during which participants’ response was recorded.

In the Corsi and the Corsi inverse tasks, the participants were presented with 10 blue squares distributed on a screen with a grey background (see Fig 1 for the distribution of the squares on the screen). Those blue squares would change to green in previously established orders, creating the sequences that the participants had to remember. The ten squares were first presented on the screen, in blue, for 1000ms. After that, one of them changed to green for 1000ms. Then they all went back to blue for another 1000ms, and the next square in the sequence changed to green. This process was repeated until all the squares of the current sequence had turned green and back to blue again. Then, a question mark appeared in the middle of the screen, and participants needed to point with their finger to the squares that had changed colour. In the Corsi task, participants had to indicate the changes in the same order as they occurred. In the Corsi inverse task, they had to indicate the changes in the reverse order. There were 8 consecutive blocks with an increasing difficulty level, and in each block two sequences (trials) were presented. The increase in difficulty was produced by adding one more square change to the sequences of each consecutive block. Thus, the trials in the first block consisted of two square changes, the trials in the second block consisted of three square changes, and so on up to 9 square changes in the last block. After each block (that included 2 trials), the experimenter noted the number of correctly recalled sequences. If the participant failed both, the experiment ended. If the participant remembered one or two, the next block started (see Fig 2 for a schematic representation of the experiment, and see Table 2 for the details on each trial).
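The difference between the forward and the inverse versions lies only in the order in which a recalled sequence must match the presented one. A minimal sketch of that scoring rule follows; the function name and the square numbers in the example are illustrative and do not come from Table 2.

```python
def is_correct_recall(presented, recalled, inverse=False):
    """True if the recalled sequence matches the presented one.

    In the forward version the order must be identical; in the inverse
    version the recalled sequence must be the presented one reversed.
    """
    target = list(reversed(presented)) if inverse else list(presented)
    return list(recalled) == target


sequence = [3, 7, 1]  # hypothetical square numbers
print(is_correct_recall(sequence, [3, 7, 1]))                # forward version
print(is_correct_recall(sequence, [1, 7, 3], inverse=True))  # inverse version
```

The same rule would apply to any span task in which an ordered sequence is recalled either as presented or backwards.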

Fig 1. Spatial distribution of the squares in the Corsi and Corsi inverse tasks and the numbers assigned to each of them.

Fig 2. Schematic representation of the Corsi and Corsi inverse tasks.

Table 2. Stimuli of working memory tasks.

Number of the square that changed in each of the sequences displayed in each trial of the Corsi and Corsi inverse tasks and the numbers played in each trial of the digits span and digit span inverse tasks. For the graphical display of the position of each square, see Fig 1.

For the digit span and the inverse digit span tasks, pre-recorded sequences of digits were presented to the participants. See Table 2 for a description of said sequences and the digits used in them. There were 8 blocks in total in the experiment, each of them containing two trials. In each trial, the participants were presented with a fixation point in the centre of the screen while sequences of numbers, previously recorded by a native Basque-Spanish speaker, were presented auditorily. Participants had to listen to the series of numbers that were presented at an approximate rate of one digit per second, and once each sequence finished, they were asked to repeat the numbers of the sequence out loud. For the digit span, this repetition had to be done in the same order as they listened to the numbers. For the digit inverse span task, they had to repeat them in the inverse order. The difficulty increased with each block, as the sequence length increased with each of them: the sequences of the first block consisted of two numbers, the sequences of the second block consisted of three numbers, and so on until the 8th block, in which the sequences consisted of 9 numbers (see Table 2). After the two trials of each block, the experimenter recorded the number of correctly retrieved sequences in that block. If the participant failed both of the trials, the experiment finished. If the participant accurately repeated one or two sequences, the experiment continued (see Fig 3 for a schematic representation of the experiment and see Table 2 for details on the trials).

Fig 3. Schematic representation of the digits span and digits span inverse tasks.

Data analysis

The data from the Stroop task required separate pre-processing before analysis. Audios were equalized to a 63dB amplitude using Praat [89]. Once all the files had the same amplitude level, the voice onset was automatically detected by Praat as follows: the textgrid of each audio was divided into “sound” and “silence” segments using the silence function from Praat. For a segment to be considered “sound” it had to have a minimum pitch of 100 Hz, to have exceeded a -25dB threshold and to have lasted at least 100ms. “Silence” segments had to last at least 200ms. The starting time point of the first sound segment was considered the onset of the speech and, therefore, the reaction time of that response. The accuracy of the responses was checked manually, the speech onset was manually adapted in the cases in which subjects corrected themselves (e.g., “roj…amarillo”), and mistakes were removed.
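The last step of this procedure, reading the reaction time off the labelled segments, can be illustrated as follows. The sketch does not call Praat: it assumes the “sound” and “silence” intervals have already been exported as `(start_ms, end_ms, label)` tuples, and the function name `speech_onset_ms` is hypothetical. The minimum-duration check simply re-applies the 100ms criterion mentioned above in case the exported intervals were not already filtered.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float, str]  # (start_ms, end_ms, label)


def speech_onset_ms(intervals: List[Interval], min_sound_ms: float = 100.0) -> Optional[float]:
    """Return the start of the first qualifying "sound" interval, or None.

    intervals are assumed to be ordered in time and labelled either
    "sound" or "silence" (an assumed export format, not Praat's own).
    """
    for start, end, label in intervals:
        if label == "sound" and (end - start) >= min_sound_ms:
            return start  # voice onset = reaction time for this trial
    return None  # no vocal response detected in the recording window
```

With intervals such as `[(0, 412, "silence"), (412, 980, "sound"), (980, 3000, "silence")]`, the onset returned would be 412 ms.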

For the four EF tasks, after removing errors, latencies were trimmed for outliers by deleting any response that deviated by more than 2 SD from the mean in each condition. After this, a 3x2 ANOVA was run with Condition (congruent, incongruent, neutral) and Language Group (bilinguals, monolinguals) as factors. To further check for any possible advantage, the conflict index (i.e., incongruent-congruent latencies), the incongruity index (i.e., the neutral condition compared to the incongruent condition), and the congruency index (the congruent condition compared to the neutral one) were compared across language groups using ANOVAs and Bayesian t-tests (calculated using JASP [90]). The error rates were separately analysed following the same procedure, using ANOVAs and Bayesian t-tests. For the analysis of the working memory tasks (tasks 5 to 8), the mean number of trials remembered (out of 16, as there were two trials in each of the 8 blocks) was compared between groups using t-tests and Bayesian t-tests.
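The trimming rule and the three indices can be made concrete with a short sketch. Several details are assumptions made for illustration, since the text does not specify them: the trimming is applied to one participant's correct-response latencies within each condition, the sample standard deviation is used, and the incongruity and congruency indices are computed as incongruent minus neutral and neutral minus congruent. The function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def trim_outliers(rts: List[float], n_sd: float = 2.0) -> List[float]:
    """Drop latencies further than n_sd sample SDs from the condition mean."""
    if len(rts) < 2:
        return list(rts)
    m, sd = mean(rts), stdev(rts)
    return [rt for rt in rts if abs(rt - m) <= n_sd * sd]


def compute_indices(rts_by_condition: Dict[str, List[float]]) -> Dict[str, float]:
    """Compute the three indices (in ms) from trimmed condition means.

    Expects the keys "congruent", "incongruent" and "neutral".
    The subtraction direction of the last two indices is an assumption.
    """
    means = {cond: mean(trim_outliers(rts)) for cond, rts in rts_by_condition.items()}
    return {
        "conflict": means["incongruent"] - means["congruent"],
        "incongruity": means["incongruent"] - means["neutral"],
        "congruency": means["neutral"] - means["congruent"],
    }
```

The per-participant indices obtained this way would then be the values entered into the between-group comparisons.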


In the reaction times analysis for the flanker task, after removing 5.15% of the data due to outliers, we observed a strong main effect of Condition (F(2, 356) = 196.15, p < .01), and a more detailed analysis indicated that congruent items were responded to faster than the incongruent ones (t(179) = 16.69, p < .01), and also that neutral items were responded to faster than both incongruent (t(179) = 15.98, p < .01) and congruent ones (t(179) = 2.48, p < .02). There was no main effect of Language Group (F(1, 178) = 0.60, p>.44) and no interaction between the two factors (F(2, 356) = 0.01, p>0.99). The conflict index analysis showed a strong Condition effect (F(1,178) = 279.92, p < .01), but no other effect or interaction was significant (Fs<1). Results from the Bayes Factor t-test comparison across groups (BF01 = 6.14) indicated that the null hypothesis explains the data 6 times better than the alternative hypothesis of bilinguals showing a reduced conflict index (see Table 3 for descriptive results). The incongruity index showed the same pattern as the conflict index, with a strong Condition effect (F(1, 178) = 253.96, p < .01) but no other significant results (Fs<1), and the Bayes Factor analysis favoured the null hypothesis (BF01 = 6.14). The congruency index also showed a significant effect of Condition (F(1, 178) = 6.14, p < .02) but no other significant effect or interaction (Fs<1). The Bayes Factor analysis also favoured the null hypothesis over the alternative one (BF01 = 6.19).

Table 3. Flanker task.

Mean reaction times (in milliseconds) and error rates (in percentages) for each condition and index are displayed together with standard deviations (between parentheses).

The analysis of the error rates showed a similar pattern. A strong and significant Condition effect was found (F(2, 356) = 34.39, p < .01), stemming from incongruent trials producing more errors than congruent trials (t(179) = 6.66, p < .01) and neutral trials (t(179) = 5.79, p < .01); but no differences were found when congruent and neutral conditions were compared (t(179) = 1.00, p>.32). Importantly, neither a main effect of Language Group nor an interaction between it and Condition was found (all Fs<1). When the conflict index was computed and compared between groups, the effect of Condition was significant (F(1,178) = 44.06, p < .01), but no other main effect or modulation was (Fs<1). As expected, the Bayes Factor t-test analysis (BF01 = 6.07) supported the null-differences hypothesis. Similarly to what was found in the RTs analysis, the incongruity index was significant (F(1, 178) = 33.35, p < .01) but no other effect or interaction was (Fs<1), as confirmed by the Bayes Factor analysis (BF01 = 6.16). The congruency index analysis, however, showed no main effect or interaction (all Fs<1), but the index was still compared across groups and the null hypothesis was the best fit for the data (BF01 = 6.08).

The general ANOVA conducted on the reaction times to correct responses in the Simon task, after removing the outliers (4.82% of the data), revealed a significant main effect of Condition (F(2, 356) = 33.14, p < .01). Post-hoc comparisons showed that incongruent trials were responded to more slowly than both congruent (t(179) = 8.06, p < .01) and neutral trials (t(179) = 5.55, p < .01), and that congruent trials were responded to faster than neutral ones (t(179) = 2.17, p < .04). However, neither the main effect of Language Group (F(1, 178) = 2.66, p>.11) nor the interaction between the two factors (F(2, 356) = 0.11, p>.89) was significant (see Table 4 for descriptive results). In the conflict index analysis, the Condition effect was significant (F(1, 178) = 64.63, p < .01), but no main effect of Language Group (F(1, 178) = 2.64, p>.11) or interaction was found (F<1), which indicates that there are no significant differences in the conflict index when it is compared across groups (BF01 = 5.85). The incongruity index showed a strong Condition effect (F(1, 178) = 30.66, p < .01), but neither Language Group (F(1,178) = 2.48, p>.12) nor the interaction between them (F<1) was significant. The Bayes Factor analysis supported that the incongruity index was similar in both groups (BF01 = 5.65). In a similar pattern, the congruency index was significant (F(1, 178) = 4.67, p < .04) but it did not interact with Language Group (F<1). Language Group was not significant either (F(1, 178) = 2.72, p>.10), and the Bayes Factor analysis supported the similarity of the index across language groups (BF01 = 6.15).

Table 4. Simon task.

Mean reaction times (in milliseconds) and error rates (in percentages) for each condition and index are displayed together with standard deviations (between parentheses).

When the error rates were analysed, a similar picture emerged. Condition was significant (F(2, 356) = 13.7, p < .01), and paired comparisons revealed that this was due to the incongruent trials producing more errors than both congruent (t(179) = 4.05, p < .01) and neutral ones (t(179) = 4.58, p < .01), with no difference between these last two (t<1). Neither an effect of Language Group nor an interaction between it and Condition was found (all Fs<1). Crucially, the conflict index was significant (F(1,178) = 16.35, p < .01) but it did not interact with Language Group, and Language Group was not significant either (all Fs<1). The Bayes Factor analysis (BF01 = 5.74) indicated that the null hypothesis was almost 6 times more likely to explain the data. Similarly, the incongruity effect was significant (F(1,178) = 20.88, p < .01) but neither Language Group nor the interaction between it and Condition was (all Fs<1). The Bayes Factor analysis indicated that the index was highly similar across language groups (BF01 = 6.19). The analysis of the congruity effect revealed that neither Condition, nor Language Group, nor the interaction between them was significant (all Fs<1.3, all ps>.26).

After removing the outliers from the Stroop task (4.84%), a general ANOVA on the reaction times to correct responses showed a main effect of Condition (F(2, 356) = 279.22, p < .01), indicating that the congruent condition was responded to faster on average than neutral trials (t(179) = 10.98, p < .01) and incongruent trials (t(179) = 21.32, p < .01). The neutral condition was also responded to faster than the incongruent condition (t(179) = 13.80, p < .01). Crucially, we observed no main effect of Language Group (F(1, 178) = 1.53, p>.22) and no interaction between it and Condition (F(2, 356) = 0.40, p>0.67). The Stroop index analysis indicated a strong effect of Condition (F(1, 178) = 452.41, p < .01), but Language Group was not significant (F(1, 178) = 1.3, p>.29) and, crucially, it did not interact with Condition (F<1). Importantly, the analysis of the Bayes Factor (BF01 = 5.62) indicated that there are no significant differences between groups and that the null hypothesis is the most likely one to explain these data (see Table 5 for descriptive results). The incongruity index analysis showed a main effect of Condition (F(1,178) = 189.59, p < .01) but a negligible main effect of Language Group (F(1,178) = 1.77, p>.18) and, importantly, no modulation of Condition by Language Group (F<1). This null difference between groups was again supported by the Bayesian t-test (BF01 = 5.79). The analysis of the congruency index showed a significant effect of Condition (F(1,178) = 120.32, p < .01) but it was not modulated by the knowledge of a second language (F(1,178) = 1.06, p>.30), which was supported by the Bayesian t-test (BF01 = 3.78). Similarly, no main effect of Language Group was found (F(1,178) = 1.62, p>.21).

Table 5. Verbal Stroop task.

Mean reaction times (in milliseconds) and error rates (in percentages) for each condition and index are displayed together with standard deviations (between parentheses).

The error rate analysis showed a strong and significant Condition effect (F(2, 356) = 10.24, p < .01), indicating that more errors were made in the items belonging to the incongruent condition than in the ones belonging to the congruent condition (t(179) = 3.52, p < .01) and to the neutral one (t(179) = 3.07, p < .01); but neither an effect of Language Group (F(1,178) = 0.87, p>.35) nor an interaction between Language Group and Condition (F(2, 356) = 1.67, p>0.19) was found. The Stroop index was computed and compared between groups, which showed a strong main effect of Condition (F(1,178) = 12.43, p < .01), but it was not modulated by Language Group (F(1,178) = 1.69, p>.2), and Language Groups did not differ either (F(1,178) = 1.35, p>.25). Furthermore, the Bayes Factor analysis (BF01 = 2.83) supported the null-differences hypothesis. The incongruity index was significant (F(1,178) = 9.45, p < .01), but Language Group did not modulate it (F(1,178) = 2.06, p>.15), nor was a main effect of Language Group observed (F<1). The Bayes Factor comparison also tended to support the null hypothesis as the best fitting candidate (BF01 = 2.38). In the congruency index analysis, Condition was not significant (F(1,175) = 2.67, p>.1) and neither was the effect of Language Group or the interaction between the two factors (Fs<1).

In the numerical Stroop task, the reaction times to the correct responses were analysed after removing outliers (4.75%). The general ANOVA showed a main effect of Condition (F(2, 356) = 202.38, p < .01), indicating that congruent trials were responded to faster than both the incongruent (t(179) = 16.37, p < .01) and the neutral ones (t(179) = 6.04, p < .01), and that neutral items were also responded to faster than the incongruent ones (t(179) = 13.80, p < .01). Crucially, we found neither a main effect of Language Group (F(1, 178) = 2.61, p>.11) nor an interaction between it and Condition (F(2, 356) = 0.40, p>0.67). The Stroop index was significant (F(1,178) = 268.63, p < .01) but Language Group (F(1,178) = 2.95, p>.09) was not. The lack of interaction between them (F(1,178) = 1.44, p>.23) indicated that the linguistic profile did not have any reliable impact on the magnitude of the Stroop effect (BF01 = 3.18) (see Table 6 for descriptive results). The incongruity index analysis showed a significant Condition effect (F(1,178) = 191.70, p < .01), but Language Group was not significant (F(1,178) = 2.62, p>.11) and neither was the interaction between them (F(1,178) = 2.23, p>.14), which was supported by the tendency shown by the Bayes Factor analysis (BF01 = 2.2). The congruency index was strong (F(1,178) = 32.25, p < .01), but the Language Group effect was not significant (F(1,178) = 2.19, p>.14), and neither was the interaction between them (F<1). The null hypothesis was also supported by the Bayes Factor analysis (BF01 = 5.89).

Table 6. Numerical Stroop task.

Mean reaction times (in milliseconds) and error rates (in percentages) for each condition and index are displayed together with standard deviations (between parentheses).

The error rate analysis also showed a significant Condition effect (F(2, 356) = 40.13, p < .01), which was a reflection of incongruent trials producing more errors than both congruent (t(179) = 6.37, p < .01) and neutral (t(179) = 6.89, p < .01) ones, but no differences were found between congruent and neutral items (t<1). Importantly, we observed no effect of Language Group or an interaction between that factor and Condition (all Fs<1). The Stroop index was compared between groups, and it revealed a main effect of Condition (F(1,178) = 40.45, p < .01), but no other effects were significant (all Fs<1). Bayes Factor analysis (BF01 = 5.15) indicated that both groups behaved similarly. Similarly, the incongruity effect analysis showed a Condition effect (F(1,178) = 47.34, p < .01), but no main effect of Language or interaction was found (all Fs<1). Again, Bayes Factor analysis indicated no differences between Language Groups (BF01 = 5.49). Finally, the congruency index analysis showed no significant effect or interaction (all Fs<1), and a further Bayes Factor analysis also indicated that the index did not differ across groups (BF01 = 5.88).

For the analysis of the working memory tasks, the number of trials remembered out of 16 (2 trials in each of the 8 blocks) was compared between groups in each of the tasks (see Table 7 for descriptive results). In the Corsi task, bilinguals remembered an average of 9.92 trials (SD = 1.97) and monolinguals an average of 9.84 (SD = 2.33), but the difference between the two groups was non-significant (t(178) = 0.24, p>.81), and the null hypothesis was also supported by the Bayes Factor analysis (BF01 = 6.02). In the inverse Corsi task, bilinguals remembered an average of 8.91 items (SD = 1.67) and monolinguals an average of 7.98 (SD = 1.95), and the difference between the two groups was significant (t(178) = 3.45, p<0.01). The alternative hypothesis was also supported by the Bayes Factor analysis (BF01 = 0.03). In the digit span, bilinguals remembered an average of 8.89 trials (SD = 2.16) and monolinguals an average of 8.6 (SD = 1.92), but the difference between the two groups was non-significant (t(178) = 0.95, p>.35), and the null hypothesis was also supported by the Bayes Factor analysis (BF01 = 4.08). Finally, in the inverse digit span, bilinguals remembered an average of 8.97 items (SD = 1.84) and monolinguals an average of 7.94 (SD = 1.84), and the difference between the two groups was significant (t(178) = 3.72, p<0.01). The Bayes Factor analysis supported the alternative hypothesis (BF01 = 0.01). The results in the Corsi and digit span tasks were also analysed using the highest recalled set size. Differences were not significant in the Corsi task (p>.43; bilinguals recalled an average set size of 6.48, and monolinguals of 6.61). For the Corsi inverse, the recalled set size was significantly different for bilinguals (6.07 items on average) and monolinguals (5.67 items on average; p < .02). Following the same pattern, the recalled set size was not significantly different between groups in the digit span task (p>.59; bilinguals recalled an average size of 6.91 and monolinguals 6.82), but it was different in the backwards span task (p < .01), with bilinguals recalling slightly larger sets (6.92) than monolinguals (6.51).
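Because all Bayes Factors in this section are reported as BF01 (evidence for the null over the alternative), values below 1 favour the alternative hypothesis. The reciprocal gives the corresponding BF10, as in the minimal helper below (the function name is hypothetical): BF01 = 0.03 corresponds to a BF10 of roughly 33, and BF01 = 0.01 to a BF10 of about 100.

```python
def bf10_from_bf01(bf01: float) -> float:
    """Convert evidence for the null (BF01) into evidence for the alternative (BF10)."""
    if bf01 <= 0:
        raise ValueError("A Bayes Factor must be strictly positive")
    return 1.0 / bf01


# Values reported for the inverse Corsi and inverse digit span tasks.
for task, bf01 in [("inverse Corsi", 0.03), ("inverse digit span", 0.01)]:
    print(f"{task}: BF01 = {bf01}, BF10 = {bf10_from_bf01(bf01):.1f}")
```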

Table 7. Results of working memory tasks.

Mean number of items remembered and standard deviations (between parentheses) are displayed for each of the task of the working memory set of experiments.

Additional analyses

The cross-task coherence was tested in a correlation analysis between the indices in milliseconds (congruency, incongruity and conflict/Stroop) obtained in the first four tasks [30,68]. The Stroop/conflict effects across tasks showed negligible correlation strength (all rs between -.06 and .11). The congruency effect showed a significant yet low negative correlation between the flanker task and the numerical Stroop task (r = -.21) and between the Simon and the Stroop task (r = -.17), and a low positive correlation between the Simon and the numerical Stroop task (r = .18, rest of rs between -.08 and .05). Finally, analyses on incongruity effects showed a significant but negative correlation between the flanker and the Simon tasks (r = -.18), and all the remaining pairs of effects indicated that the cross-task coherence was very low (all rs between -.11 and .17). Thus, this analysis consistently showed a poor level of cross-task correlation.

Furthermore, to leverage our sample size and the extensive sociodemographic data in order to better understand how often unmatched socio-demographic factors or WM skills co-occur with significant bilingual advantage in EF tasks (i.e., the first four tasks), we conducted a systematic bootstrapping study. Firstly, we randomly sampled subsets of 25, 50, and 75 participants, 1000 times for each sample size, and measured how often the EF were significantly different between groups. Then, we explored how often the sociodemographic variables differed significantly in those samples where the EF differences were found.
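The logic of this resampling procedure can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code that was actually run: it assumes that the subset size refers to participants drawn per group, that draws are made without replacement within each group, and that a difference counts as significant when a Welch t statistic exceeds 1.96 in absolute value (a normal approximation to a two-sided test at about the .05 level). The function and field names are hypothetical.

```python
import random
from math import sqrt
from statistics import mean, variance
from typing import Dict, List, Tuple

Participant = Dict[str, float]


def welch_t(a: List[float], b: List[float]) -> float:
    """Welch t statistic for two independent samples (needs non-zero variance)."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def differs(a: List[float], b: List[float], crit: float = 1.96) -> bool:
    """Rough significance check; the normal critical value is an assumption."""
    return abs(welch_t(a, b)) > crit


def cooccurrence(
    bilinguals: List[Participant],
    monolinguals: List[Participant],
    n: int,
    outcome: str,
    covariate: str,
    n_iter: int = 1000,
    seed: int = 0,
) -> Tuple[int, int]:
    """Count subsamples with a group difference in `outcome`, and how many
    of those also show a group difference in `covariate`."""
    rng = random.Random(seed)
    n_outcome = n_both = 0
    for _ in range(n_iter):
        b = rng.sample(bilinguals, n)    # random subset of bilinguals
        m = rng.sample(monolinguals, n)  # random subset of monolinguals
        if differs([p[outcome] for p in b], [p[outcome] for p in m]):
            n_outcome += 1
            if differs([p[covariate] for p in b], [p[covariate] for p in m]):
                n_both += 1
    return n_outcome, n_both
```

Dividing the second count by the first gives the kind of co-occurrence percentage reported below, for example the share of subsamples with a significant EF difference that also differ in SES.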

With 1000 random samples of 25 participants, we only found a significant difference between groups in the flanker task in 2.1% of the samples, in 1.5% of the samples in the Simon task, in 7.3% of the samples in the numerical Stroop task and in 3.1% of the samples in the Stroop task. Out of those samples that showed a significant difference between groups, 9.46% showed a significant difference between groups in Age (all of them produced by significantly older bilinguals), 4.05% of them showed a significant difference in IQ (83.33% of them produced by bilinguals having significantly higher IQ scores), and 25.68% showed significant differences in SES as measured by monthly income divided by household members (all of the significant cases were due to bilinguals scoring higher). Crucially, in 40.54% and 51.5% of these subsamples we also found a better performance for bilinguals in the inverse Corsi and inverse digit span tasks, respectively. Regarding the EF tasks, 12.16% of the subsamples showed significant differences in the Flanker task (42.86% showing a bilingual advantage, and 57.14% showing a bilingual disadvantage), 12.16% of them showed significant differences in the Simon task (73.33% of them showing a bilingual advantage, 26.67% showing a bilingual disadvantage), 60.81% showed significant differences in the numerical Stroop (all of them indicating a bilingual advantage) and 22.97% showed significant differences in Stroop (83.87% indicating a monolingual advantage, 16.13% indicating a bilingual advantage).

With 1000 random samples of 50 participants, there were significant between group differences in the flanker task in 0.2% of the samples, in the Simon task in 0.3% of the samples, in the numerical Stroop task in 6.5% of the samples and in the Stroop task in 0.5% of the samples. Exploring only the samples that showed significant differences, 10.81% of them showed significant Age differences (all of them produced by significantly older bilinguals), 6.76% of them showed a significant difference in IQ (all of them due to bilinguals having significantly higher IQ scores), and 51.35% showed significant differences in SES (all of the significant cases were due to bilinguals scoring higher). Importantly, in 89.19% and 87.84% of these subsamples bilinguals outperformed monolinguals in the inverse Corsi and inverse digit span tasks. When it comes to the EF tasks, 2.70% of these subsamples showed significant differences in the Flanker task (all of them indicating a monolingual advantage), 4.05% of them showed significant differences in the Simon task (66.67% of them showing a bilingual advantage, 33.33% showing a bilingual disadvantage), 87.84% showed significant differences in the numerical Stroop (all of them indicating a bilingual advantage) and 6.76% showed significant differences in Stroop (all of them indicating a monolingual advantage).

With random samples set to 75 participants, and thus closer to our actual number of participants, there was no sample comparison that showed significant differences between groups in the flanker, the Simon or the Stroop tasks. In 2.6% of the cases a significant difference in the Numerical Stroop task was found, where bilinguals performed better than monolinguals. Of those cases, 3.85% showed significant IQ differences (all of them produced by bilinguals performing significantly better), 88.46% showed significant differences in SES (all of the significant cases were due to bilinguals scoring higher), and 100% showed a better performance of bilinguals in the inverse Corsi and inverse digit span tasks.

Additionally, systematic multiple regression analyses were conducted to try to capture any possible influence of the sociodemographic factors on the indices obtained in each of the tasks. First, three models were built for each of the four EF tasks. The Stroop or conflict effect (in milliseconds) was used as the dependent variable, and Age, IQ score (correct responses in the abridged version of the K-Bit task used in the experiment), SES (the value resulting from dividing monthly income by household members) and LexTale score in Spanish were included as independent variables in Model 1. Model 2 also included Group (Bilinguals and Monolinguals) and Model 3 included the interaction terms between Group and the rest of the demographic and linguistic factors. None of the models explained enough of the variability in the data (all R2 < .06, all adjusted R2 < .02) and none of them reached significance (all ps > .17). Importantly, in none of those models was Group a significant predictor (all ps > .3), nor was there a significant interaction with the rest of the sociodemographic factors (all ps > .17).

The same regression approach was used to explore the possible influence of the sociodemographic factors on the WM scores. The number of correctly recalled items was used as the dependent variable, and Age, IQ, SES and LexTale scores in Spanish were included as independent variables in Model 1. Model 2 also included Group (Bilinguals and Monolinguals) and Model 3 included the interaction terms between Group and the rest of the demographic and linguistic factors. For the Corsi task, Model 1 resulted in a significant regression equation (F(4,175) = 2.98, p < .03), with an R2 of .06 and an adjusted R2 of .04. Model 2 was significant as well (F(5,174) = 2.40, p < .04) with an R2 of .06 and an adjusted R2 of .04, but did not significantly improve the first model (p>.74). Model 3 was not significant (p>.12) and did not improve the model either (p>.70). Thus, following Model 1, participants’ predicted Corsi score is equal to 11.75 - 0.10 (Age) + 0.16 (IQ). While IQ was a significant predictor (p < .01) and Age was marginally significant (p < .06), SES and LexTale scores were not (all ps>.30). Group and the interaction between Group and other factors were not significant predictors in any of the models (all ps>.29). For the Corsi inverse task, Model 1 was a significant regression equation (F(4,175) = 3.01, p < .03), with an R2 of .06 and an adjusted R2 of .04. Model 2 was significant as well (F(5,174) = 4.51, p < .01) with an R2 of .12 and an adjusted R2 of .09, and significantly improved the first model (p < .01). Model 3 was also significant (p < .01) but did not improve the model (p>.74). Thus, following Model 2, participants’ predicted Corsi inverse score is equal to 2.22 + 0.09 (IQ) + 0.06 (LexTale) + 0.85 (Group). Participants’ recalled inverse Corsi items increased by 0.09 for every one-point increase in the IQ test score and by 0.06 for every one-point increase in the Spanish LexTale score, and bilinguals recalled 0.85 items more than monolinguals. 
The LexTale score was a significant predictor (p < .05), as was Group (p < .01), and IQ was marginally significant (p < .08). SES and Age were not (all ps>.70). The digit span task followed a pattern similar to the one shown by the Corsi task: Model 1 was a significant regression equation (F(4,175) = 3.18, p < .02), with an R2 of .07 and an adjusted R2 of .05. The inclusion of Group (Model 2) produced a significant model (F(5,174) = 2.73, p < .04) with an R2 of .07 and an adjusted R2 of .04, but did not significantly improve the first model (p>.33). Model 3 was significant as well (F(5,174) = 2.07, p < .04) but did not improve the model either (p>.30). Thus, following Model 1, participants’ predicted digit span score is equal to 6.18 - 0.12 (Age) + 0.12 (IQ). IQ and Age were significant predictors (ps < .03), whereas SES and LexTale were not (all ps>.16). Group and the interaction between Group and other factors were not significant predictors in any of the models (all ps>.12). Resembling what was found for the Corsi inverse task, the analysis for the inverse digit span task showed that Model 1 resulted in a significant regression equation (F(4,175) = 5.07, p < .01), with an R2 of .10 and an adjusted R2 of .08. Model 2 was significant as well (F(5,174) = 6.90, p < .01) with an R2 of .17 and an adjusted R2 of .14, and significantly improved the first model (p < .01). Model 3 was also significant (p < .01) but did not improve the model (p>.54). Thus, according to Model 2, participants’ predicted inverse digit span score is equal to 2.38 - 0.10 (Age) + 0.14 (IQ) + 0.05 (LexTale) + 0.96 (Group). Participants’ recalled inverse digit items decreased by 0.10 for every additional year of age, increased by 0.14 for every one-point increase in the IQ test score and by 0.05 for every one-point increase in the Spanish LexTale score, and bilinguals recalled 0.96 items more than monolinguals. 
As can be seen, the significant predictors were Age (p < .03), IQ (p < .01), and Group (p < .01), and LexTale was marginally significant (p < .07). SES was not (p>.17). In this second set of tasks, i.e. the WM tasks, we did not conduct a detailed bootstrapping analysis because, while running it, we observed that the majority of the subsamples showed a bilingual advantage in the inverse versions of the WM tasks (>42% of the 25 sample subsets, >83% of the 50 sample subsets, and >99% of the 75 sample subsets), and thus the co-occurrence with other factors would not be very informative.
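As a worked illustration of the two Model 2 equations reported above for the inverse tasks, the sketch below plugs values into the published coefficients. Two points are assumptions: Group is coded as 1 for bilinguals and 0 for monolinguals (consistent with the description that bilinguals recalled more items), and only the rounded coefficients quoted in the text are used, so the predictions are approximate. The function names and the input values in the usage note are hypothetical.

```python
def predict_corsi_inverse(iq: float, lextale: float, bilingual: bool) -> float:
    """Model 2 for the Corsi inverse task: 2.22 + 0.09 IQ + 0.06 LexTale + 0.85 Group."""
    group = 1 if bilingual else 0  # assumed dummy coding
    return 2.22 + 0.09 * iq + 0.06 * lextale + 0.85 * group


def predict_digit_inverse(age: float, iq: float, lextale: float, bilingual: bool) -> float:
    """Model 2 for the inverse digit span:
    2.38 - 0.10 Age + 0.14 IQ + 0.05 LexTale + 0.96 Group."""
    group = 1 if bilingual else 0  # assumed dummy coding
    return 2.38 - 0.10 * age + 0.14 * iq + 0.05 * lextale + 0.96 * group
```

For two hypothetical participants with identical Age, IQ and LexTale values, the predicted scores differ only by the Group coefficient: 0.85 items in the Corsi inverse task and 0.96 items in the inverse digit span.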

General discussion

This study aimed at exploring the potential effects of bilingualism on two main processes: EF and WM. To the best of our knowledge, this is the first study in which large samples of bilinguals and monolinguals are extensively tested using multiple tasks to assess both their EF abilities (tasks 1 to 4) and WM span (tasks 5 to 8) while relevant demographic factors (age, IQ, SES, educational level and immigrant status) are controlled for.

The first hypothesis put to the test was the enhancement of EF as a consequence of bilingualism. In tasks 1 to 4 we attempted to verify the reliability of the bilingual advantage hypothesis by using the same tasks and equivalent populations as previous studies did [14,20,44,45], while attempting to account for the concerns raised by critics of these studies [24,30,31]. In that regard, the predictions were rather straightforward. If bilingualism provides an advantage in EF independently of the effects of the controlled factors, bilinguals would have shown reduced conflict or Stroop effects when compared to monolinguals [8] or faster global reaction times [27]. The flanker, Simon, Stroop and numerical Stroop tasks produced the expected classic patterns, with strong and constant conflict effects in all of them, mainly driven by the incongruity effect. Each condition (incongruent, congruent and neutral) behaved as expected and in accordance with preceding literature. However, none of the effects or conditions varied significantly across language groups, and language groups did not differ overall in reaction time either. Furthermore, the Bayes Factor analysis clearly showed that the null hypothesis was the most suitable explanation for the results we obtained: bilinguals and monolinguals did not differ in how they face the demands of changing tasks with congruent and incongruent trials. Importantly, the results obtained using bootstrapping analyses shed additional light on the role of the uncontrolled socio-demographic factors, as the analyses indicated that the bilingual advantage in random samples was much more frequent in small than in large samples, and also that the advantage in EF tasks co-occurred mostly with significant differences in other factors, especially in SES. The impact of SES, both together with and independent of bilingualism, has been explored in many studies, especially in samples of children [33,91–93]. 
For example, Hartanto, Toh, and Yang [94] reported that both high levels of SES and bilingualism correlated with better EF in children, but only SES was a reliable predictor of verbal WM. Importantly, they found that bilingualism predicted advantages in EF tasks in low SES groups only. Altogether, these results lend credibility to the concerns that the bilingual advantage obtained previously in these tasks might be a consequence of unmatched external factors rather than of bilingualism itself [30], and that it disappears when the confounding factors are controlled for [24,31,46–49]. Along the same lines, see a very recent meta-analysis of the effect sizes found in 152 different studies, where no strong support for the bilingual advantage in conflict monitoring, inhibition or WM is found [53].

Far from ending, though, the debate around the bilingual advantage feeds on growing evidence pointing in both directions. Our data, together with the recent findings of no bilingual advantage in Basque-Spanish bilingual children [46,47] and seniors [49], address whether this advantage in EF appears in truly bilingual speakers in a truly bilingual community. And the answer thus far, although it can only be extrapolated to comparable populations and societies, is a robust no. The conclusions drawn from the analyses conducted in the present article can only be circumscribed to a very specific kind of bilingual population, yet one as important as any other (balanced and native bilinguals immersed in a bilingual society), and we believe that they are of crucial importance to better understand and reframe the current perspectives on the bilingual advantage debate. In Basque society, the ratio of usage of Basque and Spanish differs between age groups, regions and social spheres: on average, 20% of citizens older than 16 years use Basque as much as or more than Spanish (according to the Basque Institute of Statistics, Eustat), which creates a heterogeneous linguistic mosaic in which language use varies widely across social groups.

Therefore, we wonder whether the demands that bilingualism places on the EF are strong enough to provoke changes at the behavioural level. The argument for a bilingual advantage on executive control tasks rests on the idea that monolinguals do not switch between two languages, since they only have one available. However, all human beings constantly face situations in which they have to inhibit salient responses and monitor the environment, both in general social situations and when performing concrete actions. For example, people switch between comprehension and production when they talk to somebody, they switch and keep their monitoring abilities strongly activated when they have to drive and talk to somebody at the same time, and they inhibit salient responses when they have to adapt their speech and manners to different social situations, which can range from casual to very formal. Thus, monolinguals also efficiently use their switching, inhibitory and monitoring skills, and it is unclear whether language switching in bilinguals imposes a heavier burden than the one imposed on everyone, monolingual or bilingual, in their daily life. As an indicative example, even studies comparing monolinguals with interpreters, who are experts in extreme language control due to their frequent switching demands, fail to find any evidence of a bilingual advantage in EF [95]. In essence, the main perspective on the bilingual advantage hypothesis follows the assumption that language control and general executive control functions are two completely overlapping mechanisms (e.g. “Crucially, the mechanism that reduces attention to the non-relevant language system is the same as that used to manage attention in all cognitive tasks”, [15], p.41), that EF are domain-general and that they apply to every situation in which they are needed, linguistic or not. Hence, training in one concrete aspect directly implies an improvement in any other context where the same EF are needed.
However, a different interpretation comes from questioning the roots of the advantage: what if the EF were not as domain-general as they have been claimed to be? If they were, performance in the four EF tasks used in this study should correlate positively across tasks, inasmuch as they are supposed to reflect the same general ability. Our results clearly show that it does not, suggesting different underlying mechanisms (in this same regard, see [30,68]). However, these correlations should be interpreted with caution. A recent study by Hedge, Powell and Sumner [96] explains in detail that many classical psychological tasks (like the ones typically used in the bilingual advantage literature) produce robust and replicable effects due to their low between-participant variability. Because of that, those tasks also show low reliability for individual differences (measured as the ability of a task to consistently rank individuals at two or more time points), that is, a low correlation with themselves at different time points. Therefore, it is not surprising that cross-task correlation values are also low. Furthermore, Friedman and Miyake [97] addressed the “task impurity” problem in the field of EF. They argued that the tasks used to measure inhibition always measure the inhibition of something (e.g., vocal responses, hand responses, etcetera), and processes other than pure inhibition are always involved in the performance. As a consequence, low scores in such tasks do not necessarily imply low inhibitory abilities, and low correlations between tasks do not necessarily mean that they tap into different inhibitory abilities, but may reflect individual variations in other relevant factors.
However, it is important to note that the degree of domain-specificity of the different EF components (especially switching and inhibition) has recently been questioned and tested using not only behavioural but also neuroimaging measures, and results tend to indicate that the domain-generality assumption is debatable at best [98–101]. Behaviourally, performance in linguistic and non-linguistic switching tasks does not correlate (e.g., [98,99,101]), and the same holds for linguistic and non-linguistic inhibition tasks (such as the n-2 task, see [102]). From the neural point of view, there is a strong overlap in the brain areas responsible for linguistic and non-linguistic switching, but whether or not domain-general inhibition and language control rely on the same brain mechanisms is still unclear [98,100,103,104]. Furthermore, even within the field of language control itself, the most recent studies speak of different language control mechanisms relying on different neural substrates when applied to language comprehension and production [105]. If the EF are indeed domain-specific, this would invalidate the training transfer assumption that the bilingual advantage hypothesis is based on.
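Returning to the reliability caveat attributed to Hedge, Powell and Sumner [96], the classical attenuation formula from test theory gives a feel for why unreliable measures cap cross-task correlations: the expected observed correlation equals the true correlation multiplied by the square root of the product of the two reliabilities. The sketch below is only an illustration of that textbook relation; the function name and the numbers are hypothetical and are not taken from [96] or from the present data.

```python
import math


def attenuated_r(true_r, reliability_x, reliability_y):
    """Expected observed correlation between two noisy measures.

    Classical (Spearman) attenuation: r_obs = r_true * sqrt(rel_x * rel_y).
    """
    for value in (reliability_x, reliability_y):
        if not 0 <= value <= 1:
            raise ValueError("reliabilities must lie in [0, 1]")
    return true_r * math.sqrt(reliability_x * reliability_y)


# Made-up values: two tasks tapping strongly related abilities (true r = 0.8),
# each measured with a test-retest reliability of 0.5.
print(round(attenuated_r(0.8, 0.5, 0.5), 2))  # 0.4
```

Under these invented values, a strong underlying relation would surface as a modest observed correlation, which is why low cross-task correlations alone cannot settle the domain-generality question.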

On the other hand, the second hypothesis tested in the present article predicted a potential bilingual advantage in WM skills. Following Baddeley’s model of WM [55], the results in the Corsi task would be an index of participants’ visuo-spatial sketchpad, while the digit span would reflect participants’ phonological loop system. We observed that bilinguals outperformed monolinguals in the backward versions of both the Corsi and the digit span tasks, with no differences in the forward versions. As opposed to the EF, this cognitive ability might have a stronger domain-general component: even though some authors have argued for separate WM stores and mechanisms for different sensory domains (domain-specific WM, [55,106]), others hold that the maintenance system that retains the stimuli is unitary (the domain-general perspective, [65,107,108]) despite the existence of domain-specific WM stores. Using neuroimaging techniques, some authors found different brain regions involved in processing stimuli from different domains [109–111], while others found the same region involved in memory maintenance regardless of the domain [112–115]. Trying to resolve this issue, Li et al. [116] found functional networks responsible for domain-general and domain-specific processes in WM. Interestingly, while domain-specific networks played an important role only during encoding, domain-general networks showed load-dependent patterns during encoding, maintenance and retrieval. Importantly, in our data (tasks 5–8), the differences were found only in the backward conditions, i.e., in tasks that involved more complex processing and retrieval of the stored information, namely transforming the encoded information into the backward series (see also [62], for a bilingual advantage in more demanding memory tasks but not in simple ones; and [117], for results showing a bilingual advantage in inverse digit span tasks).
These are precisely the situations in which the domain-general WM system would be required [116]. Unlike the domain-specific networks, which are not susceptible to training transfer, domain-general WM abilities may improve through enhancement in a different domain (such as bilingualism), and the existence of such transfer is worth considering, as it has been shown that training can improve WM (see [56–59], but see also [60] for evidence against the beneficial effects of training in WM). An alternative explanation relies on the argument that bilinguals have a larger combined vocabulary across their two languages than monolinguals have in their only language [118]. As bilinguals seem to rely more than monolinguals on short-term memory resources for word retrieval [119], this reliance could indeed act as training that transfers to other situations in which WM abilities are required, thus explaining why bilinguals and monolinguals differ in WM abilities.

An interesting twist regarding the source of the enhancement comes from the fact that EF and WM are strongly related [120], and differences in EF tend to correlate with differences in WM [65], especially in demanding WM tasks that require storing and processing of information [66]. Specifically, WM has been linked to the updating component of the EF (some argue they are related but separable [68], others argue they can be equated [67]). Some authors have argued that the bilingual advantage originates at the WM level and that, due to the close relation of WM with monitoring, this advantage then shows up in EF tasks [63]. Along the same lines, training WM abilities has been shown to improve EF skills as well [121]. On the other hand, other authors argue that the bilingual advantage in EF stems from improved updating (i.e., monitoring) abilities (captured in the classic EF tasks as faster overall reaction times [27]), which could then translate into an indirect improvement of WM, given the close relation between updating and WM. For example, Morales, Calvo and Bialystok [62] report a bilingual advantage in WM tasks only when the EF demands imposed by the task are high, and therefore they argue that it is the role of EF that improved bilinguals’ performance in WM (along the same lines, see [117]). However, note again that the validity of these results is called into question by the lack of control of several factors, such as ethnicity (the group of bilinguals is made up of individuals with more than 15 different second languages, indicating significant linguistic and probably ethnic differences) or SES (only parents’ educational level is reported, and very scarcely). To our understanding, the results of the present study paint the opposite picture: performance in the EF tasks was similar for bilinguals and monolinguals, indicating no bilingual advantage for our sample of native balanced bilinguals immersed in a bilingual society.
On the other hand, we consistently found a bilingual advantage in the backward WM tasks, where information has to be actively processed and retrieved in a complex way. In principle, the absence of an advantage in EF could arguably be due to the ceiling effect that adults of this age show in EF abilities [122], which would prevent any potential enhancement in EF from being captured. However, when random resampling analyses were conducted a thousand times for each sample size, the bilingual advantage was found in a variable percentage of the cases, so there was still room for differences. Interestingly, the sets that showed a bilingual advantage also displayed an advantage in backward memory tasks in the majority of the cases, as well as differences in other unmatched factors such as SES. This dissonance leads us to hypothesize that bilingualism does improve WM abilities, and that this can (but does not necessarily have to) translate into an enhancement of EF abilities when interacting with other factors, but not the other way around. Had the advantage in memory been a consequence of EF, the tasks employed to measure EF (i.e., tasks 1–4) should have shown an advantage in some of the dimensions as well. Furthermore, our findings show no general advantage in monitoring (i.e., no faster RTs) but better WM abilities. This supports the idea that these two constructs, even though often equated [67], might overlap but are different [68]. It also strengthens the argument made by Namazi & Thordardottir [63], who found that bilingual and monolingual children were equally successful at dealing with the Simon task, but those with better visual WM scores performed better in the task. They concluded that the advantage they found was related to children’s WM abilities rather than to bilingualism, and highlighted that WM should be held constant when investigating the effects of bilingualism on EF abilities.

All in all, the results obtained from the 8 tasks conducted in the present study show a very stable pattern. Firstly, native and balanced bilingualism does not improve bilinguals’ general EF abilities when compared to carefully matched monolingual counterparts. Secondly, bilingualism improves WM when the task requires active and complex processing and retrieval of the encoded information. This pattern is interpreted as a consequence of the domain-specificity of the EF and of the encoding processes of WM, which are thus not susceptible to indirect training. In contrast, the load-dependent maintenance and retrieval of the encoded information in WM tasks has been shown to be domain-general, which makes the backward conditions of the memory tasks amenable to improvement through training transfer. Despite the information provided by the lack of correlation between the EF tasks, we did not collect any data testing other aspects of EF abilities, and therefore this interpretation of the results is rather speculative.

As a general consideration, it should be borne in mind that the conclusions derived from this study are generalizable only to populations and situations similar to the ones tested here, that is, lifelong, native and balanced bilinguals (in particular the case of Basque-Spanish bilinguals, see [46,47,49]). When different bilingual profiles are considered, the same patterns are not guaranteed. For example, the factors of immigration and late bilingualism should be especially considered in future debates and research. Immigration usually involves moving to a country where a different language is spoken and forces people to become bilingual, so it often co-occurs with late bilingualism, and both factors could be confounded when the effects of bilingualism are explored (see [13,14,28,44,123,124], among others, for studies reporting bilingual advantages that tested bilingual samples formed mostly by immigrant individuals). Immigrants, who generally happen to be bilinguals, could show some enhancements that may be wrongly attributed to bilingualism when compared to non-immigrants, who happen to be monolinguals. It is still unclear whether those effects are produced purely by late bilingualism, by being an immigrant, or by a combination of both, and therefore future research addressing those hypotheses is needed to clarify the impact of different kinds of bilingualism and sociodemographic factors on EF.


The results from this set of tasks suggest that bilingualism is not enough to enhance young bilingual adults’ EF skills relative to those of young monolingual adults. Both groups behaved similarly in all the tasks, as measured by the different indices and conditions which, importantly, did not correlate across tasks. Thus, it is argued that EFs are not as domain-general as they were believed to be, and therefore the hypothesis of a training transfer produced by bilingualism is not supported. The results of the bootstrapping analysis indicate that when the bilingual advantage in EF is found, it very often co-occurs with significant differences in socio-demographic factors and memory abilities, suggesting that previous findings might have been a consequence of unmatched factors.

The results from the WM tasks clearly show that bilingualism had no effect in the easiest versions of the tasks (i.e., forward versions, where only storing and repeating are needed), but that it did improve the WM skills required in backward tasks, where storing, manipulation and retrieval are involved. We interpreted this selective bilingual advantage as also being based on the domain-specificity of some abilities. Previous findings have shown that encoding relies on domain-specific WM, and therefore no training transfer would be expected. However, the backward task has a stronger component of maintenance, manipulation and retrieval of information, which have been shown to be more domain-general and consequently more susceptible to training transfer.

The practical contributions of this work are twofold. Firstly, it is the first time that a bilingual advantage has been found in WM tasks in carefully matched, large samples of balanced and native young adult bilinguals immersed in a bilingual society. Secondly, it emphasizes the need for methodical sample matching, since we found no bilingual advantage in EF when samples were matched for known confounding factors. Importantly, in the majority of the subsamples that showed a significant bilingual advantage in the bootstrapping analysis, this advantage co-occurred with differences in other sociodemographic factors.

Altogether, the different analyses conducted with the current data and the previous findings in the field lead us to conclude that earlier reports of a bilingual advantage in EF might have been a product of the uncontrolled non-linguistic characteristics of the cohorts of participants tested. Instead, bilinguals seem to benefit from their idiosyncratic language context in situations where an active use of the elements in WM is needed, and it is in this context that they outperform their monolingual peers. We believe that the results shown here will help to reinterpret the theories behind the bilingual advantage hypothesis and to narrow down the scope of future research, helping to identify the critical factors that make the bilingual advantage appear on some occasions.


1. Bialystok E, Craik FIM, Klein R, Viswanathan M. Bilingualism, Aging, and Cognitive Control: Evidence From the Simon Task. Psychol Aging. 2004;19: 290–303. pmid:15222822
2. Miyake A, Friedman NP, Emerson MJ, Witzki AH, Howerter A, Wager TD. The Unity and Diversity of Executive Functions and Their Contributions to Complex “Frontal Lobe” Tasks: A Latent Variable Analysis. Cogn Psychol. 2000;41: 49–100. pmid:10945922
3. Miyake A, Friedman NP. The Nature and Organization of Individual Differences in Executive Functions: Four General Conclusions. Curr Dir Psychol Sci. NIH Public Access; 2012;21: 8–14. pmid:22773897
4. Lagrou E, Hartsuiker RJ, Duyck W. The influence of sentence context and accented speech on lexical access in second-language auditory word recognition. Biling Lang Cogn. Cambridge University Press; 2013;16: 508–517.
5. Midgley KJ, Holcomb PJ, VanHeuven WJB, Grainger J. An electrophysiological investigation of cross-language effects of orthographic neighborhood. Brain Res. NIH Public Access; 2008;1246: 123–35. pmid:18948089
6. Thierry G, Wu YJ. Brain potentials reveal unconscious translation during foreign-language comprehension. Proc Natl Acad Sci U S A. National Academy of Sciences; 2007;104: 12530–5. pmid:17630288
7. Green DW. Mental control of the bilingual lexico-semantic system. Biling Lang Cogn. 1998;1: 67. doi:10.1017/S1366728998000133
8. Bialystok E. Reshaping the mind: the benefits of bilingualism. Can J Exp Psychol. NIH Public Access; 2011;65: 229–35. pmid:21910523
9. Friedman NP, Miyake A, Young SE, Defries JC, Corley RP, Hewitt JK. Individual differences in executive functions are almost entirely genetic in origin. J Exp Psychol Gen. NIH Public Access; 2008;137: 201–25. pmid:18473654
10. Moreno S, Bialystok E, Barac R, Schellenberg EG, Cepeda NJ, Chau T. Short-term music training enhances verbal intelligence and executive function. Psychol Sci. NIH Public Access; 2011;22: 1425–33. pmid:21969312
11. Karbach J, Kray J. How useful is executive control training? Age differences in near and far transfer of task-switching training. Dev Sci. 2009;12: 978–990. pmid:19840052
12. Kray J, Lindenberger U. Adult age differences in task switching. Psychol Aging. 2000;15: 126–47. pmid:10755295
13. Bialystok E. Cognitive Complexity and Attentional Control in the Bilingual Mind. Child Dev. Wiley/Blackwell; 1999;70: 636–644.
14. Bialystok E, Martin MM. Attention and inhibition in bilingual children: evidence from the dimensional change card sort task. Dev Sci. 2004;7: 325–39. pmid:15595373
15. Bialystok E, Craik FIM, Grady C, Chau W, Ishii R, Gunji A, et al. Effect of bilingualism on cognitive control in the Simon task: evidence from MEG. Neuroimage. 2005;24: 40–49. pmid:15588595
16. Eriksen BA, Eriksen CW. Effects of noise letters upon the identification of a target letter in a nonsearch task. Percept Psychophys. Springer-Verlag; 1974;16: 143–149.
17. Simon JR, Rudell AP. Auditory S-R compatibility: The effect of an irrelevant cue on information processing. J Appl Psychol. 1967;51: 300–304. pmid:6045637
18. Stroop JR. Studies of interference in serial verbal reactions. J Exp Psychol. 1935;18: 643–662.
19. Paap KR, Johnson HA, Sawi O. Are bilingual advantages dependent upon specific tasks or specific bilingual experiences? J Cogn Psychol. Routledge; 2014;26: 615–639.
20. Bialystok E, Craik F, Luk G. Cognitive control and lexical access in younger and older bilinguals. J Exp Psychol Learn Mem Cogn. 2008;34: 859–873. pmid:18605874
21. Bialystok E. Effect of bilingualism and computer video game experience on the Simon task. Can J Exp Psychol. 2006;60: 68–79. pmid:16615719
22. Bialystok E, Martin MM, Viswanathan M. Bilingualism across the lifespan: The rise and fall of inhibitory control. Int J Biling. SAGE Publications; 2005;9: 103–119.
23. Costa A, Hernández M, Sebastián-Gallés N. Bilingualism aids conflict resolution: Evidence from the ANT task. Cognition. 2008;106: 59–86. pmid:17275801
24. Paap KR, Johnson HA, Sawi O. Bilingual advantages in executive functioning either do not exist or are restricted to very specific and undetermined circumstances. Cortex. 2015;69: 265–278. pmid:26048659
25. Duñabeitia JA, Carreiras M. The bilingual advantage: Acta est fabula? Cortex. 2015;73: 371–372. pmid:26189682
26. Martin-Rhee MM, Bialystok E. The development of two types of inhibitory control in monolingual and bilingual children. Biling Lang Cogn. Cambridge University Press; 2008;11: 81–93.
27. Costa A, Hernández M, Costa-Faidella J, Sebastián-Gallés N. On the bilingual advantage in conflict processing: Now you see it, now you don’t. Cognition. 2009;113: 135–149. pmid:19729156
28. Morton JB, Harper SN. What did Simon say? Revisiting the bilingual advantage. Dev Sci. 2007;10: 719–726. pmid:17973787
29. Hilchey MD, Klein RM. Are there bilingual advantages on nonlinguistic interference tasks? Implications for the plasticity of executive control processes. Psychon Bull Rev. 2011;18: 625–658. pmid:21674283
30. Paap KR, Greenberg ZI. There is no coherent evidence for a bilingual advantage in executive processing. Cogn Psychol. 2013;66: 232–258. pmid:23370226
31. Paap KR, Johnson HA, Sawi O. Should the search for bilingual advantages in executive functioning continue? Cortex. 2016;74: 305–314. pmid:26586100
32. Mezzacappa E. Alerting, Orienting, and Executive Attention: Developmental Properties and Sociodemographic Correlates in an Epidemiological Sample of Young, Urban Children. Child Dev. Wiley/Blackwell; 2004;75: 1373–1386. pmid:15369520
33. Noble KG, Norman MF, Farah MJ. Neurocognitive correlates of socioeconomic status in kindergarten children. Dev Sci. 2005;8: 74–87. pmid:15647068
34. Sarsour K, Sheridan M, Jutte D, Nuru-Jeter A, Hinshaw S, Boyce WT. Family Socioeconomic Status and Child Executive Functions: The Roles of Language, Home Environment, and Single Parenthood. J Int Neuropsychol Soc. Cambridge University Press; 2011;17: 120–132. pmid:21073770
35. Crimmins EM, Soldo BJ, Ki Kim J, Alley DE. Using anthropometric indicators for Mexicans in the United States and Mexico to understand the selection of migrants and the “hispanic paradox.” Biodemography Soc Biol. Taylor & Francis Group; 2005;52: 164–177.
36. Thomson EF, Nuru-Jeter A, Richardson D, Raza F, Minkler M. The Hispanic Paradox and older adults’ disabilities: is there a healthy migrant effect? Int J Environ Res Public Health. Multidisciplinary Digital Publishing Institute (MDPI); 2013;10: 1786–814. pmid:23644828
37. Palloni A, Arias E. Paradox Lost: Explaining the Hispanic Adult Mortality Advantage. Demography. Springer-Verlag; 2004;41: 385–415. pmid:15461007
38. Ng E. The healthy immigrant effect and mortality rates. Heal reports. 2011;22: 25–9.
39. Kreft D, Doblhammer G. Contextual and individual determinants of health among Aussiedler and native Germans. Health Place. 2012;18: 1046–1055. pmid:22784776
40. Strong K, Australian Institute of Health and Welfare. Health in rural and remote Australia: the first report of the Australian Institute of Health and Welfare on rural health [Internet]. Australian Institute of Health and Welfare; 1998.
41. Milne BJ, Poulton R, Caspi A, Moffitt TE. Brain drain or OE? Characteristics of young New Zealanders who leave. N Z Med J. 2001;114: 450–3. pmid:11700773
42. Kuhn R, Everett B, Silvey R. The Effects of Children’s Migration on Elderly Kin’s Health: A Counterfactual Approach. Demography. 2011;48: 183–209. pmid:21258887
43. Wadsworth M, Kuh D, Richards M, Hardy R. Cohort Profile: The 1946 National Birth Cohort (MRC National Survey of Health and Development). Int J Epidemiol. Oxford University Press; 2006;35: 49–54. pmid:16204333
44. Bialystok E, Shapero D. Ambiguous benefits: the effect of bilingualism on reversing ambiguous figures. Dev Sci. 2005;8: 595–604. pmid:16246250
45. Engel de Abreu PMJ, Cruz-Santos A, Tourinho CJ, Martin R, Bialystok E. Bilingualism Enriches the Poor. Psychol Sci. 2012;23: 1364–1371. pmid:23044796
46. Antón E, Duñabeitia JA, Estévez A, Hernández JA, Castillo A, Fuentes LJ, et al. Is there a bilingual advantage in the ANT task? Evidence from children. Front Psychol. 2014;5: 398. pmid:24847298
47. Duñabeitia JA, Hernández JA, Antón E, Macizo P, Estévez A, Fuentes LJ, et al. The Inhibitory Advantage in Bilingual Children Revisited. Exp Psychol. 2014;61: 234–251. pmid:24217139
48. Gathercole VCM, Thomas EM, Kennedy I, Prys C, Young N, Viñas Guasch N, et al. Does language dominance affect cognitive performance in bilinguals? Lifespan evidence from preschoolers through older adults on card sorting, Simon, and metalinguistic tasks. Front Psychol. Frontiers Media SA; 2014;5: 11. pmid:24550853
49. Antón E, Fernández García Y, Carreiras M, Duñabeitia JA. Does bilingualism shape inhibitory control in the elderly? J Mem Lang. Academic Press; 2016;90: 147–160.
50. de Bruin A, Bak TH, Della Sala S. Examining the effects of active versus inactive bilingualism on executive control in a carefully matched non-immigrant sample. J Mem Lang. Academic Press Inc.; 2015;85: 15–26.
51. Kirk NW, Fiala L, Scott-Brown KC, Kempe V. No evidence for reduced Simon cost in elderly bilinguals and bidialectals. J Cogn Psychol (Hove). Taylor & Francis; 2014;26: 640–648. pmid:25264481
52. Ramos S, Fernández García Y, Antón E, Casaponsa A, Duñabeitia JA. Does learning a language in the elderly enhance switching ability? 2016;
53. Lehtonen M, Soveri A, Laine A, Järvenpää J, de Bruin A, Antfolk J. Is bilingualism associated with enhanced executive functioning in adults? A meta-analytic review. Psychol Bull. 2018;144: 394–425. pmid:29494195
54. Miyake A, Shah P, editors. Models of Working Memory: Mechanisms of Active Maintenance and Executive Control [Internet]. 1999.
55. Baddeley AD, Hitch G. Working Memory. Psychol Learn Motiv. Academic Press; 1974;8: 47–89.
56. Klingberg T, Forssberg H, Westerberg H. Training of Working Memory in Children With ADHD. J Clin Exp Neuropsychol. 2002;24: 781–791. pmid:12424652
57. Klingberg T, Fernell E, Olesen PJ, Johnson M, Gustafsson P, Dahlström K, et al. Computerized Training of Working Memory in Children With ADHD-A Randomized, Controlled Trial. J Am Acad Child Adolesc Psychiatry. 2005;44: 177–186. pmid:15689731
58. Verhaeghen P, Cerella J, Basak C. A Working Memory Workout: How to Expand the Focus of Serial Attention From One to Four Items in 10 Hours or Less. J Exp Psychol Learn Mem Cogn. 2004;30: 1322–1337. pmid:15521807
59. Westerberg H, Jacobaeus H, Hirvikoski T, Clevberger P, Östensson M-L, Bartfai A, et al. Computerized working memory training after stroke–A pilot study. Brain Inj. 2007;21: 21–29. pmid:17364516
60. Redick TS, Shipstead Z, Harrison TL, Hicks KL, Fried DE, Hambrick DZ, et al. No evidence of intelligence improvement after working memory training: A randomized, placebo-controlled study. J Exp Psychol Gen. 2013;142: 359–379. pmid:22708717
61. Thorn ASC, Gathercole SE. Language-specific Knowledge and Short-term Memory in Bilingual and Non-bilingual Children. Q J Exp Psychol Sect A. 1999;52: 303–324. pmid:10371873
62. Morales J, Calvo A, Bialystok E. Working memory development in monolingual and bilingual children. J Exp Child Psychol. NIH Public Access; 2013;114: 187–202. pmid:23059128
63. Namazi M, Thordardottir E. A working memory, not bilingual advantage, in controlled attention. Int J Biling Educ Biling. Taylor & Francis Group; 2010;13: 597–616.
64. Engle RW. Working Memory Capacity as Executive Attention. Curr Dir Psychol Sci. SAGE Publications; 2002;11: 19–23.
65. Engle RW, Kane MJ. Executive Attention, Working Memory Capacity, And A Two-Factor Theory Of Cognitive Control [Internet]. 2004.
66. Gathercole SE, Pickering SJ, Ambridge B, Wearing H. The Structure of Working Memory From 4 to 15 Years of Age. Dev Psychol. 2004;40: 177–190. pmid:14979759
67. Bialystok E. Bilingualism and the Development of Executive Function: The Role of Attention. Child Dev Perspect. Wiley/Blackwell; 2015;9: 117–121. pmid:26019718
68. Paap KR, Sawi O. Bilingual advantages in executive functioning: problems in convergent validity, discriminant validity, and the identification of the theoretical constructs. Front Psychol. Frontiers Media SA; 2014;5: 962. pmid:25249988
69. Bialystok E. Global-local and trail-making tasks by monolingual and bilingual children: beyond inhibition. Dev Psychol. NIH Public Access; 2010;46: 93–105. pmid:20053009
70. Guido Mendes CM. The Impact of Bilingualism on Conflict Control. University of Otago; 2015.
71. Petrides M, Milner B. Deficits on subject-ordered tasks after frontal- and temporal-lobe lesions in man. Neuropsychologia. 1982;20: 249–62. pmid:7121793
72. Corsi PM. Human memory and the medial temporal region of the brain. Diss Abstr Int. 1972;34: 819B.
73. Milner B. Interhemispheric differences in the localization of psychological processes in man. Br Med Bull. 1971;27: 272–7. pmid:4937273
74. Luo L, Craik FIM, Moreno S, Bialystok E. Bilingualism interacts with domain in a working memory task: Evidence from aging. Psychol Aging. 2013;28: 28–34. pmid:23276212
75. Ratiu I, Azuma T. Working memory capacity: Is there a bilingual advantage? J Cogn Psychol. Psychology Press Ltd; 2015;27: 1–11.
76. Hansen LB, Macizo P, Duñabeitia JA, Saldaña D, Carreiras M, Fuentes LJ, et al. Emergent Bilingualism and Working Memory Development in School Aged Children. Lang Learn. 2016;66: 51–75.
77. Hansen LB, Morales J, Macizo P, Duñabeitia JA, Saldaña D, Carreiras M, et al. Reading comprehension and immersion schooling: evidence from component skills. Dev Sci. 2017;20: e12454. pmid:28032442
78. Ljungberg JK, Hansson P, Andrés P, Josefsson M, Nilsson L-G. A Longitudinal Study of Memory Advantages in Bilinguals. Bolhuis JJ, editor. PLoS One. Public Library of Science; 2013;8: e73029. pmid:24023803
79. Luo L, Luk G, Bialystok E. Effect of language proficiency and executive control on verbal fluency performance in bilinguals. Cognition. 2010;114: 29–41. pmid:19793584
80. Farrell Pagulayan K, Busch RM, Medina KL, Bartok JA, Krikorian R. Developmental Normative Data for the Corsi Block-Tapping Task. J Clin Exp Neuropsychol. 2006;28: 1043–1052. pmid:16822742
81. Baddeley A, Gathercole S, Papagno C. The phonological loop as a language learning device. Psychol Rev. 1998;105: 158–73. pmid:9450375
82. de Bruin A, Carreiras M, Duñabeitia JA. The BEST Dataset of Language Proficiency. Front Psychol. Frontiers; 2017;8: 522. pmid:28428768
83. Kaufman AS. Kaufman brief intelligence test: KBIT. AGS, American Guidance Service Circle Pines, MN; 1990.
84. Izura C, Cuetos F, Brysbaert M. Lextale-Esp: A test to rapidly and efficiently assess the Spanish vocabulary size [Internet]. 2014.
85. Besner D, Coltheart M. Ideographic and alphabetic processing in skilled reading of English. Neuropsychologia. 1979;17: 467–472. pmid:514483
86. Hebb DO. Distinctive features of learning in the higher animal [Internet]. 1961.
87. Rouder JN, Speckman PL, Sun D, Morey RD, Iverson G. Bayesian t tests for accepting and rejecting the null hypothesis. Psychon Bull Rev. 2009;16: 225–237. pmid:19293088
  88. 88. Wetzels R, Matzke D, Lee MD, Rouder JN, Iverson GJ, Wagenmakers E-J. Statistical Evidence in Experimental Psychology. Perspect Psychol Sci. 2011;6: 291–298. pmid:26168519
  89. 89. Boersma P, Weenink D. Praat: doing phonetics by computer [computer program](2011). Version. 5: 74.
  90. 90. JASP Team. JASP (Version 0.9)[Computer software] [Internet]. 2018. Available:
  91. 91. Noble KG, McCandliss BD, Farah MJ. Socioeconomic gradients predict individual differences in neurocognitive abilities. Dev Sci. 2007;10: 464–480. pmid:17552936
  92. 92. Hackman DA, Farah MJ, Meaney MJ. Socioeconomic status and the brain: mechanistic insights from human and animal research. Nat Rev Neurosci. 2010;11: 651–659. pmid:20725096
  93. 93. Hackman DA, Farah MJ. Socioeconomic status and the developing brain. Trends Cogn Sci. 2009;13: 65–73. pmid:19135405
  94. 94. Hartanto A, Toh WX, Yang H. Bilingualism Narrows Socioeconomic Disparities in Executive Functions and Self-Regulatory Behaviors During Early Childhood: Evidence From the Early Childhood Longitudinal Study. Child Dev. 2018; pmid:29318589
  95. 95. Van der Linden L, Van de Putte E, Woumans E, Duyck W, Szmalec A. Does Extreme Language Control Training Improve Cognitive Control? A Comparison of Professional Interpreters, L2 Teachers and Monolinguals. Front Psychol. Frontiers; 2018;9: 1998. pmid:30405488
  96. 96. Hedge C, Powell G, Sumner P. The reliability paradox: Why robust cognitive tasks do not produce reliable individual differences. Behav Res Methods. 2018;50: 1166–1186. pmid:28726177
  97. 97. Friedman NP, Miyake A. The Relations Among Inhibition and Interference Control Functions: A Latent-Variable Analysis. J Exp Psychol Gen. 2004;133: 101–135. pmid:14979754
  98. 98. CALABRIA M, BRANZI FM, MARNE P, HERNÁNDEZ M, COSTA A. Age-related effects over bilingual language control and executive control. Biling Lang Cogn. Cambridge University Press; 2015;18: 65–78.
  99. 99. Calabria M, Hernández M, Branzi FM, Costa A. Qualitative Differences between Bilingual Language Control and Executive Control: Evidence from Task-Switching. Front Psychol. 2012;2: 399. pmid:22275905
  100. 100. De Baene W, Duyck W, Brass M, Carreiras M. Brain Circuit for Cognitive Control Is Shared by Task and Language Switching. J Cogn Neurosci. 2015;27: 1752–1765. pmid:25901448
  101. 101. Prior A, Gollan TH. The elusive link between language control and executive control: A case of limited transfer. J Cogn Psychol. 2013;25: 622–645. pmid:24688756
  102. 102. Branzi FM, Calabria M, Boscarino ML, Costa A. On the overlap between bilingual language control and domain-general executive control. Acta Psychol (Amst). 2016;166: 21–30. pmid:27043252
  103. 103. Magezi DA, Khateb A, Mouthon M, Spierer L, Annoni J-M. Cognitive control of language production in bilinguals involves a partly independent process within the domain-general cognitive control network: Evidence from task-switching and electrical brain activity. Brain Lang. 2012;122: 55–63. pmid:22575667
  104. 104. de Bruin A, Roelofs A, Dijkstra T, FitzPatrick I. Domain-general inhibition areas of the brain are involved in language switching: FMRI evidence from trilingual speakers. Neuroimage. 2014;90: 348–359. pmid:24384153
  105. 105. Blanco-Elorrieta E, Pylkkanen L. Bilingual Language Control in Perception versus Action: MEG Reveals Comprehension Control Mechanisms in Anterior Cingulate Cortex and Domain-General Control of Production in Dorsolateral Prefrontal Cortex. J Neurosci. 2016;36: 290–301. pmid:26758823
  106. 106. Cocchini G, Logie RH, Della Sala S, MacPherson SE, Baddeley AD. Concurrent performance of two memory tasks: Evidence for domain-specific working memory systems. Mem Cognit. Springer-Verlag; 2002;30: 1086–1095. pmid:12507373
  107. 107. Cowan N. Attention and Memory [Internet]. Oxford University Press; 1998.
  108. 108. Saults JS, Cowan N. A central capacity limit to the simultaneous storage of visual and auditory arrays in working memory. J Exp Psychol Gen. 2007;136: 663–684. pmid:17999578
  109. 109. Smith EE, Jonides J. Storage and executive processes in the frontal lobes. Science. 1999;283: 1657–61. Available: pmid:10073923
  110. 110. Courtney SM, Ungerleider LG, Keil K, Haxby J V. Object and spatial visual working memory activate separate neural systems in human cortex. Cereb Cortex. 1996;6: 39–49. Available: pmid:8670637
  111. 111. Ungerleider LG, Courtney SM, Haxby J V. A neural system for human visual working memory. Proc Natl Acad Sci U S A. 1998;95: 883–90. Available: pmid:9448255
  112. 112. Cowan N, Li D, Moffitt A, Becker TM, Martin EA, Saults JS, et al. A Neural Region of Abstract Working Memory. J Cogn Neurosci. 2011;23: 2852–2863. pmid:21261453
  113. 113. Chein JM, Moore AB, Conway ARA. Domain-general mechanisms of complex working memory span. Neuroimage. 2011;54: 550–559. pmid:20691275
  114. 114. Majerus S, D’Argembeau A, Martinez Perez T, Belayachi S, Van der Linden M, Collette F, et al. The Commonality of Neural Networks for Verbal and Visual Short-term Memory. J Cogn Neurosci. 2010;22: 2570–2593. pmid:19925207
  115. 115. Koelsch S, Schulze K, Sammler D, Fritz T, Müller K, Gruber O. Functional architecture of verbal and tonal working memory: An FMRI study. Hum Brain Mapp. 2009;30: 859–873. pmid:18330870
  116. 116. Li D, Christ SE, Cowan N. Domain-general and domain-specific functional networks in working memory. Neuroimage. 2014;102: 646–656. pmid:25178986
  117. 117. Blom E, Küntay AC, Messer M, Verhagen J, Leseman P. The benefits of being bilingual: Working memory in bilingual Turkish–Dutch children. J Exp Child Psychol. 2014;128: 105–119. pmid:25160938
  118. 118. BIALYSTOK E, LUK G, PEETS KF, YANG S. Receptive vocabulary differences in monolingual and bilingual children. Biling Lang Cogn. 2010;13: 525–531. pmid:25750580
  119. 119. Kaushanskaya M, Blumenfeld HK, Marian V. The relationship between vocabulary and short-term memory measures in monolingual and bilingual speakers. Int J Biling. SAGE PublicationsSage UK: London, England; 2011;15: 408–425. pmid:22518091
  120. 120. McCabe DP, Roediger HL, McDaniel MA, Balota DA, Hambrick DZ. The relationship between working memory capacity and executive functioning: Evidence for a common executive attention construct. Neuropsychology. 2010;24: 222–243. pmid:20230116
  121. 121. Salminen T, Strobach T, Schubert T. On the impacts of working memory training on executive functioning. Front Hum Neurosci. Frontiers Media SA; 2012;6: 166. pmid:22685428
  122. 122. Hartshorne JK, Germine LT. When does cognitive functioning peak? The asynchronous rise and fall of different cognitive abilities across the life span. Psychol Sci. NIH Public Access; 2015;26: 433–43. pmid:25770099
  123. 123. Bialystok E. Factors in the Growth of Linguistic Awareness. Child Dev. WileySociety for Research in Child Development; 1986;57: 498.
  124. 124. Bialystok E, Senman L. Executive Processes in Appearance-Reality Tasks: The Role of Inhibition of Attention and Symbolic Representation. Child Dev. 2004;75: 562–579. pmid:15056206