computers … are used every day by language students and teachers as an integral part of every lesson, like a pen or a book … without fear or inhibition, and equally without an exaggerated respect for what they can do. They will not be the centre of any lesson, but they will play a part in almost all. They will be completely integrated into all other aspects of classroom life, alongside coursebooks, teachers and notepads. They will go almost unnoticed.
(Bax, 2003, p. 23)
In fact, a 2016 paper by Stickler and Shi outlining the 12 most-cited CALL-focused papers between 2004 and 2015 in the journal System suggests:
The “classroom” of the future might not look anything like classrooms even as recently as 2003: mobile learning (MALL) seems to be an ongoing and unstoppable trend; an increasing recognition of multilingualism as the norm might influence the way we use online resources; and the democratisation of research might show effects in the language classrooms of the future, as well, with teachers going well beyond the methods of action research and learners becoming involved in the production of research rather than being regarded as passive “participants” or “subjects”.
(Stickler & Shi, 2016, p. 125, emphasis mine)
The idea that language learners may be able to use technology to play a more active role in their own language learning process is one that has not escaped applied linguists, and it is notable that one of the 12 most-cited papers in the Stickler and Shi article focuses on the use of corpora for language learning in an L2 writing context (Gaskell & Cobb, 2004). Ever since the early pioneers of corpus linguistics noted that linguistic knowledge can be construed as knowledge of patterns (e.g. Sinclair’s 1991 idiom principle) gained through our encounters with these patterns in use (e.g. Hoey’s 2005 lexical priming), the use of corpora to extract these patterns, and to teach them with corpus data, has been a major research area in applied linguistics over the past 25 years. Studies have focused on corpus data presented to language students by their teachers or, increasingly, on situations where the learners themselves develop the autonomy to discover the complexities of language through their own corpus consultation. These approaches, collectively termed data-driven learning (DDL; Johns, 1990), featured in over 200 empirical studies between 1989 and 2014 across a range of languages (Boulton & Cobb, 2017), and given the upwards trajectory in DDL studies since Gaskell and Cobb (2004), many more studies have recently been published or are in progress.
DDL is a pedagogical approach in which direct learner engagement with corpus data, in the form of either printed materials or learner-led hands-on corpus consultation using corpus tools, allows students to learn and internalise statistical and contextual information about language in use. Under a DDL approach, learners consult corpus data as “language detectives”, with “every student a Sherlock Holmes” (Johns, 1997, p. 101) as he or she queries, manipulates, and visualises a range of output data including concordances of query words with surrounding context, statistical information in the form of frequency or collocation lists, and increasingly visual or multimodal forms of data (see Hirata, this volume). DDL has often been described as an “inductive” approach to learning, where learners’ active engagement with corpus data leads to increased focus-on-form (Long, 1991) and data-enhanced “noticing” of language features (Schmidt, 1990) that replaces the need to memorise abstract textbook “rules” while promoting a range of constructivist skills correlated with improved learning practices (e.g. Cobb, 1999).
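For readers unfamiliar with these output types, the following minimal Python sketch illustrates what a concordance, a frequency list, and a collocation list look like in practice. It is an illustration only and does not reproduce any particular corpus tool discussed in this volume: the function names, the very simple tokeniser, the window sizes, and the three-sentence sample text are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately crude tokeniser)."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, node, width=3):
    """Return keyword-in-context lines as (left context, node word, right context)."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=2):
    """Count the words occurring within `window` tokens either side of the node word."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            counts.update(tokens[max(0, i - window):i])
            counts.update(tokens[i + 1:i + 1 + window])
    return counts


# Hypothetical mini-"corpus", invented purely for illustration.
sample = (
    "She made a decision quickly. "
    "He made a mistake, then made a decision to stay."
)
tokens = tokenize(sample)

# Concordance: each hit of the query word, aligned with its surrounding context.
for left, node, right in concordance(tokens, "made", width=2):
    print(f"{left:>15} [{node}] {right}")

# Frequency list: the most common words in the sample.
print(Counter(tokens).most_common(3))

# Collocation list: the words that most often appear near "made".
print(collocates(tokens, "made", window=2).most_common(3))
```

A learner reading the aligned concordance lines might notice, for instance, that “made” is repeatedly followed by “a decision” or “a mistake”. This is the kind of pattern that DDL invites learners to discover for themselves, although real corpus tools work with far larger datasets and typically rank collocates by statistical association measures rather than by raw counts.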
The main reason for the sharp increase in DDL studies published over the last 25 years is that DDL works, and people are starting to take notice. Aside from the aforementioned cognitive benefits, DDL works because it reflects the digital age we now find ourselves in, a world where information on any topic is available to learners at the touch of a button and where vast language resources are accessible and queryable in increasing numbers as corpus technology continues to improve. The more that language learners take charge of their own learning outside the use of contrived textbook examples (with this charge greatly facilitated by innovative corpus technologies), the more they develop the skills for deductive reasoning, problem-solving, and autonomous learning so essential to the development of the modern twenty-first-century language learner. And the evidence is in: an already much-cited meta-analysis of the effectiveness of DDL (as measured by DDL’s ability to increase learners’ skills or knowledge) found large positive effect sizes across a range of learning goals and contexts (Boulton & Cobb, 2017). Lee, Warschauer, and Lee’s (2018) meta-analysis of the effectiveness of DDL for vocabulary learning shows similar positive results, with a range of other studies reporting positively on DDL’s affordances for error correction (e.g. Crosthwaite, 2017), phraseology (e.g. Geluso & Yamaguchi, 2014), disciplinary-specific register and lexis (e.g. Crosthwaite, Wong, & Cheung, 2019), translation and interpreting (e.g. Sotelo, 2015), and many others of the “multiple affordances” (Leńko-Szymańska & Boulton, 2015) of language corpora for data-driven learning.
Many of the contributors to this very volume have also experienced firsthand the effectiveness of DDL in their own contexts, disseminating empirical findings of data-driven learning in numerous publications while changing the learning habits of hundreds of students as they “spread the word” (Römer, 2009) as missionaries of data-driven learning.
So far, so good. So what’s the problem? The problem is that if you ask anyone under the age of 18, or anyone responsible for teaching those under 18, what a corpus is, the chances are that you will get a blank look and a shrug of the shoulders. Put simply, while DDL is rapidly becoming established as a viable pedagogical approach in tertiary education, there is a real dearth of research into the affordances of DDL with pre-tertiary learners. Boulton and Cobb (2017) reported that only ten out of the 88 samples included in their meta-analysis of DDL studies involved secondary school students; six samples used between-subject designs, and four used within-subject designs. Pérez-Paredes (forthcoming) reported that only two out of the 32 studies exploring DDL or corpora in language education during the 2011–2015 period involved secondary school learners. Lee et al.’s (2018) meta-analysis did not include the age of the learners as a variable at all. Boulton and Cobb (2017) concluded that “unfortunately, there is little [DDL] research with high school learners” (p. 375), while the amount of DDL-focused research conducted with primary school learners can probably be counted on one hand (see Crosthwaite & Stell; Hirata, this volume).
Why have pre-tertiary learners been left out in the cold when it comes to DDL thus far? We know of the “rational fears” (Boulton, 2009) affecting uptake of DDL generally, including the fears of learners regarding a switch from teacher-led to self-guided learning, the complexity of corpus consultation, and difficulties understanding corpus output; there are also the fears of teachers, who feel their students would not be able to handle DDL or who may be distrustful of corpus resources (see also Schaeffer-Lacroix, this volume). Yet, despite ICT now being a core part of both primary and secondary school curricula as a twenty-first-century skill (Voogt, Knezek, Cox, Knezek, & ten Brummelhuis, 2013) and despite most international primary and secondary school assessments of digital literacy measuring skills such as “searching, retrieving, and evaluating digital information” (Siddiq, Hatlevik, Olsen, Throndsen, & Scherer, 2016, p. 58), the integration of corpus technology within the pre-tertiary classroom has not yet really begun in earnest. Eng (2005) describes three phases required for such integration, with the first (emerging) phase dealing with ICT infrastructure, a second (application) phase involving the application of technology within current teaching-learning processes, and a third (infusion) phase where teachers are able to use technology in different ways for innovative pedagogies. However, while many schools now have general ICT infrastructure in place (at least, those with fairly privileged educational budgets), we are not yet at the point where infrastructure for corpora and DDL tailored specifically for younger learners is either available or appropriate for general use (Stenström, Andersen, & Hasund, 2002; Braun, 2007; Pérez-Paredes, this volume).
In addition, we are still a considerable way off the aforementioned second “application” phase, primarily because pre- and in-service teachers lack the knowledge or training to understand and use corpora on their own, let alone to have developed any interest or proficiency in DDL materials or lesson preparation. Given that a teacher’s relative state of professional development in ICT is “the most significant variable in explaining classroom ICT use” (Gil-Flores, Rodríguez-Santero, & Torres-Gordillo, 2017, p. 447) and that failure to integrate ICT use into the classroom can be generally explained by a lack of “constructivist beliefs” among in-service teachers (ibid.) who are unaware, or even afraid, of what corpora can bring to the classroom, it is clear that much more work needs to be done to persuade teachers that DDL can potentially bring enormous benefits for their students as well as contribute significantly to their own professional development. This situation means that the third “infusion” phase, the normalisation of corpora in the pre-tertiary classroom, is still some way off. Add to this the various complexities of getting funding or ethical approval for school-based research, and it is not difficult to see why DDL has not gained a footing in mainstream pre-tertiary education.
This volume therefore presents a range of studies that seek to address how corpus-based data-driven learning with young learners is finally emerging, how it is being applied, and how it can be infused into classroom practice in innovative ways. The studies represent a broad range of international perspectives on the use of corpora for DDL, including contributions from Europe, Asia, and Australia. These contributions have been produced by both experienced DDL experts and early-career scholars who have each attempted to document the successes as well as the challenges involved in applying DDL to pre-tertiary teaching and learning. The volume is divided into three distinct sections, each of which is described below.
Part 1: overcoming emerging challenges for DDL with younger learners
Fanny Meunier’s chapter discusses the need for constructive alignment for DDL involving younger learners in the L2 classroom. This requires that the curriculum and its intended outcomes, the teaching methods used, and the assessment tasks be consistently and coherently aligned, a coherence that so far is lacking in pre-tertiary DDL research. Corpora and DDL, Meunier argues, can greatly facilitate such alignment in terms of data-driven improvements to teachers’ content knowledge, pedagogical knowledge, and technological knowledge. This can be achieved by expanding the types of tools and tasks currently used in DDL activities beyond traditional concordancers or corpus platforms to include applications that are more likely to appeal to younger learners but that still allow DDL to occur. In support, Meunier presents useful examples of such tools (e.g. PlayPhrase.me, LyricsTraining) and how they can be used for DDL, and the chapter ends with a call for more researchers and teachers to build on the affordances of new digital tools to DDL-ise them, leveraging tools “from the wild” into the classroom.
Oliver Wicher’s chapter presents a critical evaluation of DDL for younger learners from the perspective of foreign language didactics, that is, theories of teaching and learning in classroom instruction. Wicher calls for pedagogic processing of corpora when teaching younger learners, simplifying the type of corpus data to be processed by the learners, as well as the processes by which young learners are expected to engage with such data. Wicher outlines two phase models currently in use for L2 lesson planning in secondary schools (PPP and TBLT) before outlining how DDL can be incorporated into each model, with useful accompanying sample activities. Wicher then calls for increased scaffolding to overcome individual differentiation in corpus abilities, with this scaffolding in the form of input enhancement (e.g. colour coding or highlighting/underlining concordance results), DDL assessments that vary according to learners’ proficiency, and teacher manipulation of concordance results to avoid weaker learners becoming overwhelmed. Wicher also notes that for DDL to truly succeed, the communicative dimension must not be ignored. Specifically, learners must be able to use the target structures in their own output afterwards.
Eva Schaeffer-Lacroix’s chapter focuses on the training of pre-service teachers in corpus use and DDL lesson planning in a secondary L2 German context in France, dealing specifically with the barriers to trainees’ appreciation and use of corpora for language learning. Her chapter outlines both technical and conceptual barriers to corpus innovations as evidenced in trainees’ attempts to create an L2 learning activity involving the use of corpora for DDL, with evidence from observations and interviews. Technical barriers encountered by the trainees included limited knowledge of corpus query syntax and corpus functions, leaving their lesson plans lacking in innovation and variety, while conceptual barriers including teacher beliefs abou...