Computer-assisted instruction versus inquiry-based learning: The importance of working memory capacity

The Covid-19 pandemic has led millions of students worldwide to intensify their use of digital education. This massive change is not reflected by the scant scientific research on the effectiveness of methods relying on digital learning compared to other innovative and more popular methods involving face-to-face interactions. Here, we tested the effectiveness of computer-assisted instruction (CAI) in Science and Technology compared to inquiry-based learning (IBL), another modern method which, however, requires students to interact with each other in the classroom. Our research also considered socio-cognitive factors–working memory (WM), socioeconomic status (SES), and academic self-concept (ASC)–known to predict academic performance but usually ignored in research on IBL and CAI. Five hundred and nine middle-school students, a fairly high sample size compared with relevant studies, received either IBL or CAI for a period varying from four to ten weeks prior to the Covid-19 events. After controlling for students’ prior knowledge and socio-cognitive factors, multilevel modelling showed that CAI was more effective than IBL. Although CAI-related benefits were stable across students’ SES and ASC, they were particularly pronounced for those with higher WM capacity. While indicating the need to adapt CAI for students with poorer WM, these findings further justify the use of CAI both in normal times (without excluding other methods) and during pandemic episodes.


Introduction
The accelerating spread of Covid-19 has led the majority of countries across the globe to close their schools for varying lengths of time. If closing schools seems to be a logical decision in "teacher-led large group instruction" is a coarse definition of what goes on in the classroom and the transmission of knowledge may vary greatly depending on the teachers' characteristics or their preferred methods of instruction. A better comparison should include instruction based on more identifiable and well-defined practices for the domains under interest here, that is, Science and Technology.
Inquiry-based learning. A particularly well-suited method for teaching scientific subjects is IBL. It refers to a self-learning method that follows practices similar to those of professional scientists in order to construct knowledge through self-directed investigations [21]. The famous philosopher John Dewey and the Physics Nobel Prize-winner Georges Charpak democratized IBL by promoting the application of scientific reasoning principles to education in order to provide students with the skills required in modern societies. Typically, the method consists in teachers supervising a group of students working collaboratively in face-to-face interactions and elaborating scientific concepts using hypothetico-deductive thinking [22,23]. When addressing a particular science topic, the IBL process is based on the active reproduction, by students themselves, of the fundamental steps of scientific reasoning: formulating hypotheses, designing an experiment, collecting results, interpreting them and drawing conclusions [24,25]. To do so, students work hands-on in dedicated classrooms, thus allowing them to interact with physical material, under the guidance of a trained teacher who ensures that the IBL reasoning phases are completed successfully.
The effectiveness of inquiry-based learning has long been a subject of debate. While the evidence from the early studies indicated positive effects of IBL compared to more conventional, teacher-led instruction [26,27], international reports have challenged this position. The 2015 PISA [28,29] report investigated achievements in science among students from primary to middle school across all countries and economies included in the project. Regarding the impact of IBL on academic performance, the PISA report showed that after statistically controlling for students' and schools' socioeconomic status (SES), IBL was negatively associated with students' performance in fifty-six countries. When the analysis was restricted to the OECD countries, IBL was positively associated with features other than performance, such as epistemological convictions and motivation to engage in a scientific career, even though these correlations were weaker than with direct instructional methods. Providing a fine-grained analysis of PISA 2015 in England, Jerrim et al. [30] reached similar conclusions. The analysis showed that, unless associated with high levels of guidance, IBL had a very weak positive relationship with attainment in science, and that this small effect was robust regardless of the type of inquiry, test measure, varying levels of disciplinary climates in classrooms, gender, and prior attainment.
Indeed, guidance is an important variable that seems to account almost entirely for the presence of an IBL effect [25,31]. Applying a meta-analytic approach to 72 studies, Lazonder and Harmsen [25] distinguished among studies using different types of guidance and contrasted these studies with others using unguided IBL. They found that a minimal amount of guidance was needed for IBL to be effective, with an average effect size of .50 on learning outcomes in terms of learning skills and domain knowledge. The fact that the IBL effect size on learning skills was twice as high (.78) as that on domain knowledge (.37) is consistent with previous findings suggesting that IBL is particularly suited for improving epistemological thinking rather than memorizing factual content [29], and better suited for deep than surface learning [32]. Furthermore, increasing the degree of guidance (i.e., from mere supervision to full explanations) seemed to have little impact except on measures of performance during ongoing activities [25], suggesting that any type of guidance is sufficient to elicit an IBL effect on learning outcomes. However, it should be remembered that Jerrim et al. [30] found that IBL was effective only when coupled with high levels of guidance. Finally, Lazonder and Harmsen [25] showed that the positive effect of guided IBL was not specific to age, suggesting that children, teenagers and adolescents benefited equally from it.
Comparing the two methods. Although the popularity of IBL has risen more steeply than that of CAI in recent years (cf. Fig 1), this does not necessarily mean that IBL is more effective. Curiously, the two methods have never been compared directly in a single integrative study on identical objects of knowledge in the Science and Technology field. Our approach is oriented towards helping teachers to identify which methods, among available ones, are most efficient in inculcating knowledge and competences that reflect standard evaluation criteria from the national programme. It is worth noting that in the general curriculum, those criteria largely involve the acquisition of factual and conceptual knowledge more than meta-cognitive skills [33]. Therefore, in the absence of new evaluation policies, a method's efficiency is here understood as its ability to address those standard criteria. In the context of the digital revolution, and given the considerable financial support available for digital technologies in education (EdTechs) [34], it is critically important to determine whether CAI is beneficial compared to well-established and well-defined alternative forms of instruction such as IBL. The CAI vs IBL comparison is interesting as both methods a) are typically associated with active learning. In both methods, students play an active role in the learning process by engaging in problem-solving activities, an approach which requires more than just listening [35,36]. Furthermore, in both methods b) students benefit from highly interactive environments and c) work autonomously in a self-paced manner under the supervision of the teacher [17,37].
Conversely, CAI and IBL differ with regard to three main factors. These concern a) the use of digital technology for learning (absent in IBL) and b) the collaborative nature of IBL practices. In IBL, students indeed typically engage in peer-to-peer interactions [23], while in CAI, they typically interact with the computer agent [38]. Therefore, in IBL they can engage in debates and discussions, activities which are minimized in CAI. Another difference concerns c) the immersive properties and frequent use of private feedback in CAI, which are known to provide high control over the task and keep students busy and motivated [39]. Feedback in CAI is delivered by the computer and is thus not visible to peers. In contrast, most feedback in IBL is public, especially that coming from peers. It should be noted that IBL has also been adapted for use in computer-assisted environments [40,41]. In the present study, however, we are interested in the very pragmatic and direct comparison of CAI vs IBL in their differing but conventional forms as applied to identical objects of knowledge. This comparison was prompted by the following simple and pragmatic question: which of the two methods, in their conventional form, is more effective for teaching identical topics in Science and Technology?
The role of working memory, academic self-concept and socioeconomic status. Not only has no direct comparison ever been made, but there is also a striking lack of documentation on how basic socio-cognitive individual differences, which are fundamental in education, may modulate the effectiveness of the two methods. The intrinsic navigational nature of CAI and its rich functionalities and content raise questions regarding the cognitive requirements that can support such a form of instruction, especially in terms of WM capacity [42,43]. WM is thought of as a flexible but limited mental capacity that permits the temporal maintenance and manipulation of information in an active state for ongoing processing [44]. It reflects "an ability to maintain information in the maelstrom of divergent thought" [45], where maintenance relates to the crucial ability to temporarily store information in memory while directing attention towards the stimuli that are relevant to our current goals (e.g., learning). WM is a strong predictor of general cognitive abilities [45,46] and academic achievement [47,48]. More precisely, WM is essential for supporting complex activities such as language, reading comprehension, problem-solving and reasoning [49,50].
Students with low WM capacity (i.e., those who are less able to control their attention) might be particularly impacted by the diversity of content, materials and navigational features of CAI, which may distract them from their learning goals. The digital environments implemented in CAI indeed expose students to large amounts of information presented in different modalities and through hypertext links that may overload the cognitive system [51,52], especially in students with low WM capacity who may need to repeat the same action several times in order to understand specific pieces of information before moving on to the next step. As an illustration of the deleterious effects of navigational features, Scharinger et al. (2015) [53] found an additional cognitive load when reading involved having to navigate through hypertext links compared to pure reading.
For different reasons, IBL could be equally challenging for the attentional system. According to Cognitive Load theorists (e.g., [42,54]), classic IBL relies little on previous explicit exposure to content, thus preventing novice students from building a mental model of the material itself in long-term memory [55,56]. This may result in IBL instructional designs that increase cognitive load [55,57] and impair retention [58]. However, providing high levels of guidance in IBL reduces cognitive load [55] and the more complex the task is, the more guidance is required [57].
Additionally, the present study considered two other major socio-cognitive factors that may moderate the CAI/IBL effects, namely academic self-concept (ASC) and SES. Academic self-concept, the perception that students have about their own abilities compared with those of their classmates [59], constitutes one of the most relevant variables in the academic world because of its influence on motivation, learning and cognitive functioning [60]. As IBL requires students to work collectively, those with a low ASC may experience negative social comparisons with some of their classmates and lack the necessary confidence when reasoning in their presence [61]. This in turn may hamper their progression due to increased confusion or increased social withdrawal under IBL, a problem that may be reduced under CAI. This idea is supported by evidence showing that publicly drawing attention to the failures of students with low ASC (even without any intention to force a negative comparison with their peers) may cause new failures to arise [61]. An alternative hypothesis holds that collaborative work may be a means to improve self-efficacy [62], a construct close to ASC [63]. In particular, students who deliberately pay attention to peers who succeed in the task at hand are likely to increase their sense of self-efficacy [62,64].
Students' socioeconomic background may also make a difference. Those from privileged backgrounds may have access to more opportunities to explore sciences outside the school (e.g., family support, going to museums, having encyclopaedias and personal computers at home) [65], thus enhancing their knowledge and potentially giving them an advantage in both methods compared with their low-socioeconomic counterparts.

Research questions
The aim of the present study was to compare the two methods on identical Science and Technology topics taken from the official French national educational programme, while also focusing on socio-cognitive factors, WM in particular, as possible moderators of the effects of these instructional methods. More precisely, we sought to answer the following questions: 1. Is CAI more effective than IBL in learning similar topics in Science and Technology? 2. Do WM capacity, ASC and SES modulate the effects of these instructional methods?

Participants
An initial sample of 837 middle-school students participated in this study. Of the initial sample of students, 4.2% did not complete the academic tests in at least one of the disciplines, including Physics-Chemistry, Earth and Life Sciences, and Technology. Of the remaining 802 participants, 26.6% made errors on more than 50% of the secondary task of the WM task (see the "Working memory capacity" subsection of the Materials) and were therefore excluded from the analyses. We additionally excluded 5.8% of the remaining 589 students as they were identified as univariate outliers on WM performance on at least one of the two following criteria: interquartile range × 1.5 and Cook's distance. Of the remaining 555 participants, 8.3% did not complete all the items of all the ASC scales in the related disciplines. The final sample therefore consisted of 509 middle-school students (M age = 12.82, SD = 0.44; 272 females), which is quite large compared to experimental field studies in these areas [8,25]. All of the students were seventh graders; 282 took all three courses (55%), 97 took two courses (19%), and the remaining 130 took only one course (26%). Within the final sample (N = 509), 46% were categorized as privileged students and 54% as disadvantaged students according to the nomenclature of professions and socio-professional categories of the French Ministry of Education. Three hundred and twenty-eight students (64%) received IBL instruction and the remaining one hundred and eighty-one students (36%) were taught using CAI. Since our research project took place in authentic school settings, the level of participation of particular schools and referent teachers determined the enrolment of individuals or groups of students in one or more topics.
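The exclusion steps above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that each reported percentage applies to the sample remaining at that step and that head counts are rounded to the nearest student. The function and variable names are hypothetical.

```python
def remaining(n, pct_excluded):
    """Participants left after excluding pct_excluded percent of n (rounded to whole students)."""
    return n - round(n * pct_excluded / 100)

# Exclusion steps and percentages as reported in the Participants section.
steps = [
    ("incomplete academic tests", 4.2),
    ("more than 50% errors on the WM secondary task", 26.6),
    ("univariate outliers on WM performance", 5.8),
    ("incomplete ASC scales", 8.3),
]

n = 837  # initial sample
trail = [n]
for label, pct in steps:
    n = remaining(n, pct)
    trail.append(n)
    print(f"after excluding {label}: {n}")

# Allocation to the two instructional methods in the final sample.
ibl, cai = 328, 181
print(f"IBL share: {ibl / (ibl + cai):.0%}, CAI share: {cai / (ibl + cai):.0%}")
```

Under these assumptions the running totals match the figures reported in the text (837, 802, 589, 555 and 509), and the two conditions add up to the final sample.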

Ethics statement
The study is part of a larger research project which received approval from the Clermont Auvergne University Ethics Committee (number IRBO0011540-2018-08) in conformity with the French law on bioethics (covering Psychology). All participants' parents received a written informed consent form several weeks before the study that they had to read and sign to allow their child to participate.

Lesson plan and implementation
Computer-assisted instruction implementation. The CAI versions used for each subject and topic ("mass and volume" in Physics-Chemistry, "climate" in Earth and Life Sciences, and "material structure" in Technology) came from recent versions of a tool (Tactileo©) developed by Maskott©. These more sophisticated and dynamic versions were the product of a collaborative project in which programmers integrated the material content provided by teachers while parametrizing the CAI in accordance with teachers' and researchers' recommendations. For this study, we adopted the idea that technologies created or adapted by research teams and including teachers are more efficient for learning than those either taken from the commercial market or that simply use the technology as a delivery system [66]. For each topic, secondary education teachers, school inspectors, programmers and designers collaboratively transcribed the knowledge content of specific topics taken from the official French national educational programme into the system. The selection of the knowledge content was decided on and supervised by state school inspectors representative of the three disciplines involved. The knowledge content was adapted to the CAI architecture by means of a variety of pedagogical training methods that are typically reported in the CAI literature [13,14,17], including problem solving exercises and tutoring modes embedded in narrative scenarios (see S1 and S2 Appendices in S1 File). The content was displayed through a variety of materials (texts, videos, audios). The teachers' role was to introduce the topic and then to let their students learn on their own by interacting with the CAI and only intervene in the case of problems or questions from students. Students were instructed to avoid collaboration with other students in order to maximize the time spent at their computer. 
However, if students spontaneously interacted with their classmates, the teachers did not stop them from doing so as long as the exchange was brief. The teachers therefore encouraged their students to interact primarily with the computer. Each student assigned to the CAI condition was equipped with a digital tablet and interacted with the CAI in their usual classrooms in the presence of their usual subject teacher.
Inquiry-based learning implementation. All teachers in the IBL condition had been trained in this method by experts from a national foundation dedicated to IBL, represented locally by the "House for Science in Auvergne" (Maison Pour la Science en Auvergne). This training was a prerequisite for teachers to be involved in the IBL condition and guaranteed that they met the national standards for good IBL practices, including being able to give an appropriate level of guidance to students (cf. [30]). The underlying content knowledge in the IBL condition was identical to that of the CAI condition and adapted to the instructional method by expert IBL teachers under the supervision of state school inspectors. In the same way as in the CAI condition, teachers in the IBL condition introduced the topic and provided general guidelines on how to reason scientifically about the topics under study. Students worked in their usual classrooms and were instructed by their teachers to learn on their own by collaborating with other students (in groups of 3-4 students) according to the IBL guidelines of the House for Science in Auvergne. Teachers intervened in the case of problems or questions from students, while also ensuring that the different phases of IBL reasoning (i.e., formulating hypotheses, designing an experiment, collecting results, interpreting them and drawing conclusions) proceeded correctly.
Intervention duration. In both the IBL and CAI conditions, students were exposed to topics related to Physics-Chemistry, Earth and Life Sciences, and Technology (see Fig 2) for a period varying from four to ten weeks. More precisely, the exposure duration to CAI and IBL differed across disciplines depending on the usual amount of time devoted to each topic in the French national educational programme. The CAI and IBL interventions lasted four weeks for Technology (focusing on material structure), six weeks for Earth and Life Sciences (climate), and ten weeks for Physics-Chemistry (mass and volume).

Materials
Academic performance. Test of prior knowledge (T0) before intervention. Measures of prior knowledge were taken in order to determine the efficacy of CAI on academic performance in Physics-Chemistry, Earth and Life Sciences and Technology. These measures assessed contents from the national educational programme acquired during the previous year and were used to provide a performance baseline intended to control for initial individual differences in performance as well as to examine potential interactions with instructional methods. The tests of prior knowledge consisted of short-answer questions and multiple-choice questions focusing on relevant topics from the national educational programme for each subject. Students took the T0 at least two weeks before the experimental interventions. Both to maximize statistical power and to standardize test metrics, the three T0 scores were centred, averaged and scaled to form composite Science and Technology scores ranging from 0 to 20 points.
Tests of knowledge (T1) after intervention. As for T0, T1 measures also consisted of short-answer and multiple-choice questions focusing on relevant topics of the national educational programme for each subject (for an example of a T1 knowledge test, see S3 Appendix in S1 File). In contrast with T0, T1 measures assessed contents that were taught during the current curriculum year (seventh grade) via participation in one of the two instructional methods (IBL or CAI). The T1 tests included a mixture of factual knowledge and learning skills in accordance with the requirements of the French government (see S4 Appendix in S1 File), which was represented by state school inspectors who actively collaborated in this study. In Physics, for example, the students had to learn to differentiate the notions of mass and volume, to understand which instruments are used to measure one or the other, and their conditions of use. The "paths" for this learning were therefore different depending on whether the students were exposed to the CAI or IBL method. However, the final test was composed of questions and exercises corresponding to the common denominator of the knowledge and skills that each student could, in principle, acquire with these two methods. Students took T1 approximately two weeks after the intervention. Again, the three T1 scores were merged into a single Science and Technology score ranging from 0 to 20 points.
Socio-cognitive assessments. Working memory capacity. This continuous variable was the WM performance score on the Operation span task adapted from [67] and available online at http://englelab.gatech.edu/taskdownloads. The Operation span task is a computer-based task consisting of lists of to-be-remembered (TBR) items interspersed with to-be-processed (TBP) items. Participants had to memorize lists of TBR items while processing the items in the secondary task and to recall the lists of TBR items at the end of each trial. The TBR items were letters and the secondary task was an arithmetic operation judgment task. For each TBP item, participants had to click on "yes" or "no" response buttons to determine whether the current item was correct or incorrect among an equal number of correct and incorrect items. At the end of a trial, a response screen invited participants to recall the TBR items in serial order by clicking on the right items presented among a number of distracters and then press an "enter" button to validate the response. The WM score corresponded to the average proportion of memory items (i.e., consonants) that were correctly recalled in serial order for lists of 4, 5 and 6 consonants.
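To make the scoring rule concrete, the following sketch computes a serial-order proportion score of the kind described above. It is an illustration, not the scoring script used in the study: the function names and the example trials are hypothetical, and it assumes that the proportion is first computed per trial and then averaged across trials.

```python
def trial_score(presented, recalled):
    """Proportion of presented letters recalled at their correct serial position."""
    hits = sum(
        1
        for pos, letter in enumerate(presented)
        if pos < len(recalled) and recalled[pos] == letter
    )
    return hits / len(presented)


def wm_score(trials):
    """Average of the per-trial proportions (assumed aggregation rule)."""
    return sum(trial_score(p, r) for p, r in trials) / len(trials)


# Hypothetical trials with list lengths 4, 5 and 6: (presented, recalled).
trials = [
    ("FHJK", "FHJK"),    # all four letters in the right position
    ("LNPQR", "LNQPR"),  # P and Q swapped: three of five positions correct
    ("STXYKF", "STXY"),  # last two letters omitted: four of six correct
]

print(round(wm_score(trials), 3))
```

Scoring by position gives partial credit for partly correct lists, which is why a student who swaps two letters still receives 3/5 for that trial rather than zero.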
Academic self-concept. A 6-point Likert-type scale [68] was adapted from the French translation by Huguet et al. (2009) [59]. We modified Huguet et al.'s version developed for French and Mathematics to assess self-concept in Physics-Chemistry, Earth and Life Sciences, and Technology. All three versions showed very good reliability (Cronbach's alphas > .80). Academic self-concept scales were tailored to each discipline; therefore, if students were taking more than one course, their ASC in related disciplines were averaged.
Socioeconomic status. We used the nomenclature of professions and socio-professional categories published by the French National Institute for Statistical and Economic Studies [69]. We collapsed the original four-category indicator (i.e., disadvantaged, medium, privileged and highly privileged backgrounds) into two categories (i.e., low and high) in order to simplify the statistical analyses.

Data collection
All these data (academic performance and socio-cognitive assessments) were collected online via a dedicated platform built for the purpose of the study. Students completed the tests and questionnaires directly on the platform, which was made accessible from the school computer lab. Data collection was supervised by national education personnel trained for the purpose of the study. During data collection, each class was divided into two groups to guarantee a sufficient number of computers per student and minimize noise.
Data collection spanned a maximum of thirty-seven weeks and varied depending on each Science and Technology discipline. At the beginning of the school year, all students completed the psychological assessments over a period of four weeks. Several weeks later (two to four weeks for Technology, nine to eleven weeks for Earth and Life Sciences and thirteen to fifteen weeks for Physics-Chemistry), the academic pre-tests were administered. Two weeks later, the intervention was deployed for four to ten weeks (see Intervention duration). Finally, two weeks after each intervention, the academic post-test was administered.

Data analyses
We applied multilevel random intercept models to the three-level structure of the data (509 students in 48 classes in 11 schools). Random attribution to experimental conditions (i.e., IBL, CAI) was simply not feasible here since an optimum CAI approach depended on essential equipment-related conditions, such as an adequate internet connection or a sufficient number of modern computers per student, that not all schools could fulfil. This is a typical field constraint found in many large-scale studies [70]. To overcome this constraint, we followed up-to-date recommendations [71,72] and performed multilevel modelling, while carefully controlling a range of parameters in order to increase validity by reducing estimation bias, as described below. We conducted all the statistical analyses with R software version 4.0.1 [73] and used the car [74] and lme4 [75] packages for the preliminary analyses and subsequent multilevel models, respectively.
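For readers less familiar with this class of models, a three-level random intercept model with main effects only can be written as below. This is a generic sketch rather than the authors' exact specification: the symbols (the fixed coefficients γ, the random intercepts v and u, the residual e and the variance terms σ²) are notation assumed here for illustration, with student i nested in class j nested in school k. Treating the instructional method as a student-level indicator is likewise an assumption of the sketch.

```latex
\begin{aligned}
\mathrm{T1}_{ijk} &= \gamma_{0}
  + \gamma_{1}\,\mathrm{CAI}_{ijk}
  + \gamma_{2}\,\mathrm{T0}_{ijk}
  + \gamma_{3}\,\mathrm{SES}_{ijk}
  + \gamma_{4}\,\mathrm{WM}_{ijk}
  + \gamma_{5}\,\mathrm{ASC}_{ijk}
  + v_{k} + u_{jk} + e_{ijk}, \\
v_{k} &\sim \mathcal{N}(0, \sigma_{v}^{2}), \qquad
u_{jk} \sim \mathcal{N}(0, \sigma_{u}^{2}), \qquad
e_{ijk} \sim \mathcal{N}(0, \sigma_{e}^{2}).
\end{aligned}
```

Here v and u shift the intercept for each school and each class, respectively, while the slopes are common to all groups. A moderation effect would be represented by adding a product term, for example a coefficient on CAI × WM.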

Preliminary analyses
We conducted preliminary analyses of pre-test imbalance (Fig 3) for students' prior knowledge, SES, WM and ASC to ensure that students exposed to IBL and CAI had similar socio-cognitive characteristics. These analyses resulted in non-significant group differences for prior knowledge, F(1, 507) = 1.31, p = .25, SES, χ²(1) = 3.02, p = .08, and WM, F < 1. Only ASC showed an imbalance at pre-test, F(1, 507) = 17.09, p < .001; however, as the effect size was very small (η² = .03), it was easily dealt with in the subsequent statistical procedure by fixing all covariates at their grand mean [71,72]. Prior knowledge and socio-cognitive factors were entered as covariates and fixed at their grand mean in subsequent multilevel analyses. This procedure allowed us to obtain bias-free estimates of the effects of the instructional methods on learning outcomes.
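The reported effect size for the ASC imbalance can be recovered from the F statistic and its degrees of freedom. The sketch below uses the standard identity for a single-factor comparison, η² = (F × df_effect) / (F × df_effect + df_error); the function name is hypothetical.

```python
def eta_squared_from_f(f_value, df_effect, df_error):
    """Eta squared implied by an F ratio in a single-factor design."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# ASC imbalance at pre-test: F(1, 507) = 17.09
eta_sq = eta_squared_from_f(17.09, 1, 507)
print(f"eta squared = {eta_sq:.4f}")
```

The result is about .03, in line with the value given in the text: the instructional method accounts for roughly 3% of the variance in ASC at pre-test.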

Multilevel models
Out of a series of models testing either main effects of instructional methods, prior knowledge and socio-cognitive factors, or moderation effects in addition to main effects, two models fitted the data equally well (see Table 1). Of the two models (Models 1 and 4), Model 1 comprised main effects of instructional method, prior achievement, and socio-cognitive variables, whereas Model 4 additionally comprised a moderation effect of instructional method due to WM capacity. As both models showed equal utility, we focused on the more explanatory interaction model (Model 4). The percentages of variance (Intra-Class Correlation coefficients, ICCs) explained by schools and classes out of the total variance were negligibly small (ICCschools = 0.5% and ICCclasses = 0%) and are therefore not depicted. Importantly, as shown in Fig 4, prior knowledge and each socio-cognitive variable independently contributed to learning outcomes. Values of the regression coefficients from Model 4 for main effects of prior knowledge, SES, WM and ASC are shown in Table 1.

Is computer-assisted instruction more effective than inquiry-based learning in learning similar topics in Science and Technology?
The results revealed that students who received CAI significantly outperformed students who received IBL in Science and Technology by 1.38 points (out of 20; see Table 1). This roughly corresponds to an improvement from the 50th to the 68th percentile. In other words, average students without any special commendation would be eligible for a cum laude distinction if instructed with CAI as opposed to IBL. Regarding the prevention of school failure, these results indicate that about 6% of students receiving IBL who failed on national evaluations would have succeeded with CAI. Six percent of seventh graders still represents a population of nearly 50,000 students in France [76] and 300,000 students in the US [77].
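To give a sense of scale for the percentile statement, a percentile shift can be translated into standard deviation units. The sketch below assumes normally distributed scores, which is an assumption of this illustration and not necessarily how the percentiles in the text were derived; the function name is hypothetical.

```python
from statistics import NormalDist


def percentile_shift_in_sd(p_from, p_to):
    """Distance between two percentiles of a normal distribution, in SD units."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.inv_cdf(p_to) - std_normal.inv_cdf(p_from)


# Moving an average student (50th percentile) up to the 68th percentile.
shift = percentile_shift_in_sd(0.50, 0.68)
print(f"shift = {shift:.2f} SD")
```

Under the normality assumption, moving from the 50th to the 68th percentile corresponds to a gain of a little under half a standard deviation.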
A one-standard-deviation gain in WM capacity resulted in a supplementary CAI benefit of 0.57 points (out of 20) regardless of students' prior knowledge, SES, and ASC. This additional benefit corresponds to an approximate improvement from the 68th to the 78th percentile. This means that among students who failed on national evaluations, an additional 2% would have succeeded if taught with CAI instead of IBL, thanks to their higher WM capacity of about one standard deviation above average. Transposed to the general population of seventh graders, this proportion would represent 15,000 and 90,000 students in France [76] and in the US [77], respectively.

Discussion
The PISA 2015 [29] survey (i.e., a survey conducted every 3 years with a sample of more than 500,000 middle-school students in 72 participating countries) reported no link between financial investments in information and communication technologies for education and students' results on standardized tests, a finding which has revived the debate about the effectiveness of digital devices in education [10]. However, the PISA (2015) report is based on non-experimental and cross-sectional data. While the correlational analyses reported in the PISA surveys may question the usefulness of digital practices, they are not sufficient to invalidate the relevance of such practices in education. Furthermore, digital education represents an umbrella term for very different methods where sophisticated tutoring approaches such as CAI are lumped together with less sophisticated ones that merely deliver content, whose limitations in terms of effectiveness have been highlighted by the Covid-19 events [5].

Fig 5. Adjusted mean Science and Technology scores (Model 4) in inquiry-based learning and computer-assisted instruction (left panel) and as a function of students' working memory capacity (right panel).
https://doi.org/10.1371/journal.pone.0259664.g005

Computer-assisted instruction was more effective than inquiry-based learning on students' performances in Science and Technology
In the present comparative experimental approach, our findings suggest that CAI generally outperforms IBL in Science and Technology, with the benefits being greater for students with higher WM capacity. Although it is beyond the scope of this article to determine all the characteristics responsible for the better performance observed in CAI, there may be several explanations. First, the highly structured nature of CAI, supported by a variety of training methods, helps keep students engaged in the learning process and avoid off-task behaviours [78]. Conversely, research has shown that collaborative work (such as in IBL) is prone to off-task behaviours [79] and may lead to great variability in within-group individual contributions to the task [80], both of which have negative consequences on learning outcomes [81]. Second, and more importantly, the tutoring modes in CAI ensure structured support adapted to each student's learning pace [82]. This entails more regular feedback than in IBL simply because teachers working with large classes have limited time and attention, making it difficult to support each student individually [83]. Although student peers may provide some degree of feedback during IBL, this feedback may not be as valid and reliable as a teacher's expert feedback [84]. Third, the narrative scenarios in CAI may produce more contextualized representations in long-term memory and foster meaning attribution [85,86], although this element is less likely to make a difference as the IBL environment is highly contextualized too and additionally provides physical interaction with the real world, which is known to enhance memory retention [87,88].
The finding that CAI outperforms IBL is an important one since IBL is considered a gold standard method for teaching Science and Technology at school. In these domains, in which reasoning, planning and finding solutions through face-to-face collaborative interactions-all of which are central to IBL-are the rule rather than the exception, our results are counter-intuitive. Our findings do not mean that these highly desirable practices are not efficient. Instead, they mean that allowing students to reason and plan alone with CAI may also be a valuable option at certain points in students' learning trajectories, meaning that teachers should be able to include CAI in their repertoires along with other alternatives. An optimally balanced combination between different instruction methods may follow a dynamic adjustment to match individual needs and temporary states during the learning process. For example, performing CAI prior to IBL might help a student gain more confidence before being confronted with others' opinions while the opposite might give another student the opportunity to reflect on previous actions and discussions when receiving feedback from the digital tutor agent.
Another lesson learned from our data is that CAI proved superior regardless of students' ASC and SES, indicating that many students may indeed derive the same benefits from it. This lesson is more encouraging than the correlational reanalyses of the PISA 2015 data [89], which suggest that increasing the use of digital technologies for educational purposes among students who are less likely to use these technologies benefits only students of medium and high SES. By focusing specifically on the effectiveness of CAI by means of an experimental approach, we challenge this position by suggesting that CAI may help bring about greater equality of learning opportunities among students from different socioeconomic backgrounds.
Likewise, the fact that CAI was of equal benefit to students with high and low ASC echoes what has been reported experimentally in undergraduate students receiving either face-to-face or online instruction [90].
The benefit of computer-assisted instruction was higher for students with a higher working memory capacity
Interestingly, we found a greater benefit of CAI in students with higher WM capacity. Given that the content was identical across the CAI and IBL conditions, we interpret the observed effect as being a consequence of the instructional design. Students' WM might be overloaded by the complexity of the CAI environment, with the result that students who are better able to overcome this difficulty benefit more. In particular, the navigational nature of CAI, including hypertext links and tools that enhance students' autonomy, might distract attention from essential information that will not be properly assimilated by the attentional system [43]. The navigational demands of CAI particularly affect the extraneous component of cognitive load, that is, the complexity of task-irrelevant material associated with the way information is presented (cf. the instructional design), as opposed to intrinsic load, that is to say the complexity of the information itself [42,91].
Another possibility is that IBL could have made greater demands on WM by increasing the cognitive load [55]. However, this account is not supported by the significant Instruction Method × WM capacity interaction effect, which indicates that CAI imposes greater WM demands. There are two possible explanations for this. First, the IBL teachers in our study were trained to meet the national standards for IBL. These include an appropriate level of guidance [25,30], which reduces the cognitive load [55,57] and therefore also the WM demands of the task. The second explanation is based on the collaborative nature of IBL and transactive memory. Transactive memory refers to a collective mechanism through which a group develops a memory system that distributes information across partners [92]. In collaborative tasks, the interaction between partners seeking to achieve a common goal often results in a specialized division of labour where the different partners adopt specific roles in the task [92,93]. In a first encoding phase, the partners' roles are defined [94]; for example, different members perform the different scientific steps involved in IBL. During a storage phase, the members store the information specific to their roles, thus retaining as opposed to sharing different information [92,95]. During a retrieval phase, the members combine the different sources of information that have been encoded and stored within the framework of their respective roles. Consequently, each partner works as a memory aid for the others, leading to a collaborative memory system that exceeds the capacity of each individual member [92,93]. As retention is distributed across partners, the cognitive load for each individual, and thus the WM demands of the IBL instructional design for each individual learner, may be reduced. Given that students work individually in CAI, WM demands are higher since each student must memorize the content on their own, making the contribution of WM more visible.
To help reduce school failure through CAI, in both normal and troubled times, our findings suggest that one important aspect requiring attention is the consequences of CAI use for students with below-average WM capacity, for whom CAI brought no benefit (see Fig 5, right panel, students with -2 standard deviations from the mean WM measure). This suggests that the conditions of use of CAI should be adapted for these students, who are more likely to exhibit attentional problems. In line with previous recommendations, one solution may be to shorten or sequence the CAI session (e.g., 15-min sessions, 3 times per day) in order to relieve attention and memory load [96]. Fortunately, this objective could easily be achieved since CAI is highly flexible, individualized and remains accessible outside of school. Furthermore, the instructional design of CAI could be adapted to each student's WM capacity. To reduce the cognitive load for all students and further boost the benefits of CAI, our CAI condition should have carefully considered the split-attention effect [42,97]. When synchronized in a way that maximizes multimodality overlap, the use of different media modalities helps focus students' attention, for example through the complementary and simultaneous inclusion of audio and visual sources [42,97]. For students with lower WM capacity, decreasing the number of elements presented on screen in the light of individual capacity may reduce the observed WM effect [93,98]. However, considering that students with higher WM capacity can process larger amounts of information, thus deriving more benefit from CAI, we may still expect the achievement gap between low and high WM students to increase even with more WM-adaptive CAI technology. This can be viewed as an extension to CAI of the Matthew effect, a framework describing how children with various minor advantages in reading (and other abilities) progress faster and draw away from their less advantaged peers, thus steadily increasing the achievement gap throughout the schooling process [94,96].
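The dependence of the CAI benefit on WM capacity discussed above can be made concrete with a minimal sketch based on the two coefficients reported in the Results: a CAI advantage of 1.38 points (out of 20) and a supplementary 0.57 points per standard deviation of WM capacity. The sketch assumes that the 1.38-point effect applies at average WM capacity (i.e., that WM was standardized in the model) and that the moderation is linear; both are assumptions made for illustration. It uses point estimates only and ignores their uncertainty.

```python
CAI_MAIN_EFFECT = 1.38  # reported CAI advantage in points (out of 20); assumed to hold at mean WM
WM_INTERACTION = 0.57   # reported extra CAI advantage per 1 SD of WM capacity


def cai_benefit(wm_z):
    """Estimated CAI-minus-IBL difference (points out of 20) at a given WM z-score."""
    return CAI_MAIN_EFFECT + WM_INTERACTION * wm_z


for wm_z in (-2, -1, 0, 1, 2):
    print(f"WM = {wm_z:+d} SD -> estimated CAI benefit = {cai_benefit(wm_z):.2f} points")

# WM level at which the estimated benefit would reach zero under these assumptions
zero_crossing = -CAI_MAIN_EFFECT / WM_INTERACTION
print(f"Estimated benefit reaches zero near WM = {zero_crossing:.2f} SD")
```

Under these assumptions, the estimated benefit shrinks to about a quarter of a point at two standard deviations below the mean, which is in line with the observation that CAI brought no benefit for students at that WM level.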

Implications
By indicating that, in normal times, CAI may be more efficient than the well-established method used for Science and Technology (IBL), our findings further legitimize CAI as a way of helping to prevent the disastrous consequences of pandemics on academic learning. However, as also indicated by our data, the benefits of CAI are not independent of students' working memory capacity. The interaction found here between CAI and WM gives us reason to doubt the commercial claims which have multiplied in the absence of solid scientific data since the start of the pandemic, suggesting, for example, that e.learning increases retention rates by 25% to 60%. As our results indicate, even with digital technologies accessible to all (regardless of students' SES), their educational effectiveness is not necessarily guaranteed, as their benefits for learning may depend on factors such as students' WM. Failure to take this into account would be to condemn ourselves to an "e.learning illusion" liable to aggravate rather than improve the situation of many students around the globe. In addition to the learning losses characterizing many students during the school holidays (roughly one month of learning on average [99]), students with lower WM capacity would be penalized by inappropriate digital education. There are areas in which this new approach can be implemented successfully, but it is also necessary to be aware that the educational use of CAI and digital technologies in general may have to be nuanced by students' cognitive characteristics [100].
This research effort, which must be conducted in parallel with the search for Covid-19 vaccines and treatments, is essential if we are to assess the value of CAI and e.learning in general, and not only on the basis of their frequency of use by teachers and students. Likewise, it is essential that the communities concerned (teachers, students and their families, policy makers) discuss their experiences in this area in order not only to try to optimize them but also to identify and/or enrich the most relevant avenues of research and to avoid sterile and possibly also dangerous slogans (as can also be the case with a hastily produced vaccine). Despite the urgency of dealing with the current pandemic, our results therefore suggest that we should not be afraid of devoting scientific research to identifying the strengths and weaknesses of the uses of digital technologies and of the currently available services and applications. This is all the more important given that pandemics appear to have been increasing in frequency over the last few decades, and that the adoption of online learning may persist post-pandemic and thus be used more intensively than before. If this happens, we stress that policy makers should pay particular attention to the implementation of tutoring techniques in distance learning in addition to the provision of content. In normal times, however, this argument should not be taken as being in favor of an "all-digitalized" education system, but rather in favor of a diversification of methods to better address students' heterogeneity.

Limitations
Some limitations should be pointed out. First, we were not able to determine exactly which characteristics of CAI specifically tap into WM. Future research is needed to provide clear indications about which features (e.g., navigational constraints, diversity of functionalities) tap into WM and may be further adapted for students with lower WM capacity (in addition to the recommendations provided here on CAI). Second, this study was conducted with middle-school students, meaning that the benefits of CAI might have been underestimated. As shown in previous work, while the effectiveness of IBL is stable across ages [25], CAI effects increase with school grades, with the largest effect sizes being found in postsecondary education [8]. Third, we were limited by the fact that the variety of topics studied here were analysed together as a Science and Technology score. Indeed, although our sample size was large enough to conduct broad, yet robust, analyses, thereby increasing generalizability as in the case of a meta-analysis, it lost in specificity due to the use of smaller and more heterogeneous samples in separate school subjects and experimental conditions, precluding fine-grained, but still robust, analyses. This heterogeneity was directly linked to field constraints and, while we acknowledge that ecological settings do not offer ideal conditions, only this complementary approach can provide a bridge between the laboratory and the real world. A fourth limitation of our study stems from its lack of direct assessment of students' interest in school subjects and topics. For example, Maltese and Tai (2011) [101] have stressed the importance of students' early interest in science topics for predicting their enrolment in college science and mathematics courses, with 65% of the students declaring that their interest started before middle school. This is especially important given that IBL is known to enhance students' motivation to learn and interest in scientific topics [102]. A direct assessment of interest in future studies may be useful in order to gain insights into whether CAI or IBL may differentially benefit students with low interest in science subjects. More generally, a better discrimination of knowledge and skills components in our knowledge tests would have shed light on the differentiated and complementary nature of the benefits that CAI and IBL produce on factual knowledge and epistemological thinking [25,29], which would further legitimate a combined use of these methods. Last but not least, a fifth limitation relates to the lack of a collaborative version of CAI (using CAI in combination with peer-to-peer interactions), which could have clarified our comparison by controlling for the effect of collaboration per se, although the present results do not suggest a particular benefit associated with collaborative work.

Conclusion
The present study showed the potential of CAI to improve academic performance in Science and Technology compared to IBL, a well-established and popular method of instruction. Compared to IBL, the benefits of CAI were stable across students' ASC and SES, while being higher for students with higher WM capacity. Despite the overall benefit of CAI, our results suggest that special attention should be paid to the WM demands of CAI, which might require adaptations to the instructional design for students with lower WM capacity.