Historical Perspectives
One can only examine historical perspectives if one has a sense of when “history” begins. For the purposes of this chapter, I take the 1960s as the beginning of the field of second language acquisition (see Thomas 1998; Gass, Fleck, Leder, and Svetics 1998 for a debate on the field’s appropriate early date). At that time and even slightly prior to that time, most work in the field was based on a need to find a pedagogical means to help people learn a language beyond their first. Hence, the “methodology” in the early part of the 20th century was a pedagogical methodology, not a research methodology, the latter being the focus of this chapter. This chapter is further restricted primarily to quantitative methodology, given space limitations, although differences in quantitative/qualitative approaches are discussed at the end of the chapter.
In order to place research methodology in context, I looked at articles that were published in the early days of the field. To do that, I considered research published in one journal, Language Learning. This journal was selected because it was the only journal that existed during the time-frame of interest that was specifically devoted to the discipline of language learning, as is made clear in the journal title. I looked at the issues from 1967 through 1979 to get an idea of trends in research, coding the articles into five specific categories: pedagogy, descriptive, data analysis, testing, and position papers, with an additional category (other) for articles that were not easily classifiable into one of the five main categories. The category most relevant to this chapter is “data analysis” because that is the category for which data were collected and analyzed, the sine qua non of empirical research. It is important to note that prior to this time, there was little elicited data used in SLA research; many publications focused on teaching and did not present data based on learning. Further, early research in SLA was based on research from either linguistics or from child language acquisition, in terms of theoretical questions posed and research tools used.
What I found was that in this 13-year period (1967 through 1979), of the 237 articles considered, the two dominant categories were pedagogy (55) and data analysis (57). A closer look reveals a major shift in emphasis in the early 1970s. In particular, 1972 appears to be a watershed year for data analysis. In that year, five articles contained original data with analysis (compared to four in the other dominant category of pedagogy), whereas in the previous two years combined (1970 and 1971) there had been 16 in the pedagogy category and only one in the data analysis category. To further this point, in 1972, the five data analysis articles were nearly as many as had appeared in the prior five years combined (which was six). Additional evidence for the importance of this period in the use of original data and original analyses is the fact that in the subsequent seven-year period (1973–1979), there were 46 articles in that category. In other words, in the early 1970s, and even more precisely, around 1972, the field appears to have taken an important shift in emphasis.
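The counts reported above can be cross-checked with a few lines of arithmetic. The following is only a minimal sketch: the variable names and the period labels are illustrative assumptions, and the only data used are the figures stated in the text.

```python
# Counts reported in the text for Language Learning.
TOTAL_ARTICLES = 237
PEDAGOGY_TOTAL = 55
DATA_ANALYSIS_TOTAL = 57

# Data-analysis articles by period, as stated in the text
# (the period labels are illustrative, not the author's coding scheme).
data_analysis_by_period = {
    "before 1972": 6,
    "1972": 5,
    "after 1972": 46,
}


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# The three periods should add up to the reported category total.
period_sum = sum(data_analysis_by_period.values())

# Each dominant category as a share of all articles coded.
data_analysis_share = share(DATA_ANALYSIS_TOTAL, TOTAL_ARTICLES)
pedagogy_share = share(PEDAGOGY_TOTAL, TOTAL_ARTICLES)

# Share of all data-analysis articles that appeared after 1972.
post_1972_share = share(data_analysis_by_period["after 1972"], DATA_ANALYSIS_TOTAL)

print(period_sum, data_analysis_share, pedagogy_share, post_1972_share)
```

On these figures, the three periods account for all 57 data analysis articles, and roughly four fifths of them appeared after 1972, which is consistent with the shift in emphasis described above.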
Because data analysis was limited during the time leading up to the 1970s, research methods were not a focus of attention and one is hard-pressed to find discussions on the topic. Gass and Polio (2014) posed the question as to why this shift in focus occurred. We suggested that the issue of types of data for analysis was problematized in the 1972 Interlanguage article by Selinker. In that article, he made numerous claims that relate to what data are allowable; the most well-known of these is presented below:
… observable data to which we can relate theoretical predictions: the utterances which are produced when the learner attempts to say sentences of a TL …
(Selinker 1972, 213–214)
According to this claim, only utterances produced in meaningful performance situations are of use in understanding how languages are learned. In particular, research is limited to three types of data:
- utterances in the learner’s native language produced by the learner,
- interlanguage utterances produced by the learner, and
- target language utterances produced by native speakers of that target language.
Important to this discussion is that theoretical predictions in a relevant psychology of second language learning must be about the surface structures of interlanguage sentences. As part of this, Selinker explicitly singles out two data types as unacceptable: grammaticality judgment data and nonce data. The former are not relevant because researchers “will gain information about another system, the one the learner is struggling with, i.e., the TL” (213). And the latter are not relevant because “behavior which occurs in experiments using nonsense syllables” does not produce meaningful performance (210), where meaningful performance is defined as “the situation where an ‘adult’ attempts to express meanings, which he may already have, in a language which he is in the process of learning” (210). In sum, “… data resulting from these latter behavioral situations [including nonsense syllables] are of doubtful relevancy to meaningful performance situations, and thus to a theory of second-language learning” (210). These statements provide an important foundation for discussions of research methodology in second language acquisition by making explicit claims about which data are admissible and which are not.
During the 1960s and 1970s there were few discussions relating particularly to research methodology. The few notable exceptions centered on issues of grammaticality/acceptability judgments (Schachter, Tyson, and Diffley 1976; Corder 1973; Hyltenstam 1977), all of which took the position, contrary to Selinker’s claims, that their use was crucial in that certain questions about second language knowledge (as opposed to use) could only be answered through forced data elicitation, such as intuitional data. Within empirical studies, judgment data were typically collected through a forced binary choice (grammatical versus ungrammatical) of a set of sentences, and learners were often asked to modify the ungrammatical sentences to make them grammatical.
The tide began to turn as research methodology came into focus in the 1980s with the publication of a book specifically designed to address issues in research methodology and designed for an applied linguistics audience (Hatch and Farhady 1982). In the mid-1980s, other books and treatises became more prevalent, with discussions focusing on a wide range of topics related to research methods (Cook 1986, on experimental methods; Henning 1986, on quality in quantitative research; Chaudron 1986, on the need for quantitative and qualitative research). This was followed by general textbooks on research in second language learning (Brown 1988; Chaudron 1988; Seliger and Shohamy 1989; Hatch and Lazaraton 1991; Johnson 1992; Nunan 1992). At about the same time as the appearance of general textbooks, there came a number of journal articles and books dealing with specific topics (e.g., t-tests, Brown 1990, Siegel 1990; power and effect size, Lazaraton 1991, Crookes 1991; Varbrul, Young and Bayley 1996; analysis of frequency data, Saito 1999, Young and Yandell 1999; classroom research, Nunan 1991; structural equation modeling, Matsumura 2003). These are early (and current) signs of the field’s attempt to create cogent arguments about issues of methods and analysis, with a focus on quality, to which I turn below.
Throughout the history of research methods, there was an awareness that researchers and consumers of research lacked familiarity with methods of research, the value of experimental research, and techniques for data analysis (Ingram 1978; Cook 1986; Lazaraton, Riggenbach, and Ediger 1987; Brown 1991; 1992; Lambert 1991). Becoming aware of shortcomings is a sign of the field’s initial attempts to create standards relating to methods and analysis.
A final indication of the role of methods comes from statements from leading journals in the field. In 1992, TESOL Quarterly, as part of its general guidelines for article submission, introduced a section with the title “Statistical Guidelines” (see also Chapelle and Duff 2003, where qualitative methods are included). The journal did this to ensure “high statistical standards” (794) for publication. Among the topics considered important were issues of reporting (see also Polio and Gass 1997), including an appropriate layout of results, along with a discussion of assumptions underlying the use of particular statistical tests. In the following year, Valdman (1993) included an editorial comment in Studies in Second Language Acquisition in which he brought to the attention of the journal readership the importance of replication (related to the issue of reporting mentioned earlier; see also Ellis 1999, editor’s statement). Valdman took the issue a step further by introducing a replication section in the journal. Language Learning was also an early leader with regard to rigor. In 1993, a new directive for contributors to the journal appeared (“Instructions for Contributors,” 151): “Manuscripts considered for publication will be reviewed for their presentation and analysis of new empirical data, expert use of appropriate research methods…” (emphasis added). That same journal became even more stringent with issues of reporting and stated in 2000 that all submissions to the journal were required to include effect sizes for all major statistical comparisons. Other journals in the field have recently followed suit (e.g., Language Learning and Technology, The Modern Language Journal, Language Teaching, and TESOL Quarterly).
With regard to standards and reporting, the emphasis has been on quantitative methods, but qualitative guidelines have also received attention (e.g., Chapelle and Duff 2003, as well as treatments in various research methods books; Dörnyei 2007; Gass and Mackey 2007; Mackey and Gass 2005; 2012). When considering guidelines for research, a slightly different set must be acknowledged as well, namely that for ethical research, an early statement of which came in 1980 from a TESOL Research Committee (Tarone 1980). All of these guidelines (see also Loewen and Gass 2009) form an important part of the development of the field.
A major step forward in the field of SLA was the establishment in 1997 of a series of books titled Monographs on Research Methodology by Lawrence Erlbaum Associates under the editorship of Susan Gass and Jacquelyn Schachter. This series continues to this day (under the editorship of Susan Gass and Alison Mackey) with books on communication tasks (Yule 1997), stimulated recall (Gass and Mackey 2000), conversation analysis (Markee 2000), case study research (Duff 2007), priming methods (McDonough and Trofimovich 2009), questionnaires (Dörnyei with Taguchi 2009), think-alouds (Bowles 2010), and reaction time research (Jiang 2011).
Core Issues and Key Findings/Research Approaches
Research methods in the field of SLA have continued to evolve, and greater sophistication in both statistics and elicitation has resulted. Norris and Ortega (2003), in an article related to measuring acquisition, asked the basic question of what counts as acquisition. They argued that SLA is not a monolithic phenomenon and a definition of acquisition must be understood in the context of the theoretical perspective being investigated. Not only does the basic question of a definition of acquisition depend on one’s theoretical orientation, but data types and research methodologies also differ depending on the questions asked and the theoretical perspective taken. For example, those who have focused on the acquisition of morphosyntax have often relied on intuitional judgments (either binary, as in grammatical or ungrammatical, or fine-tuned comparisons, as in magnitude estimation, Bard, Robertson, and Sorace 1996). Those who have taken a sociolinguistic perspective rely on natural data or, at times, survey data (e.g., interviews and/or questionnaires, cf. Dörnyei with Taguchi 2009). Those involved with the role of interaction and corrective feedback make use of tasks to elicit appropriate data. Phonologists use acoustic measurements, relying on instrumentation to assess perception and production; psycholinguists use a wide range of techniques to better understand processing; and those concerned with on-line thought processes have utilized a range of verbal report data. In sum, many tools are available to second language researchers as they seek to better and more deeply understand how learning takes place.
As noted above, early research drew its elicitation tools primarily from linguistics and, to a lesser extent, from child language acquisition. Linguistics-based research has become less prevalent, with current research methods relying to some extent on methods from other fields, such as psychology/psycholinguistics, social psychology, and education (e.g., action research), and with other methods developing out of second language questions themselves, for example, the line of research known as input/interaction (see Gass and Mackey 2007 for a detailed discussion of ways of conducting research in second language acquisition, along with assumptions underlying research types). What stands out is the increased scrutiny of design and analysis, an area that I deal with in the following section.