1 Researching and Theorising Multilingual Texts
Mark Sebba
Introduction
To say that written multilingual discourse is under-researched is an understatement. Since the 1970s a large amount of research in the field of bilingualism has focussed on the mixing of languages in discourse, in particular code-switching and related phenomena, variously called code-mixing, code-shifting, language alternation or language interaction. Most of this work has studied spontaneously produced spoken data, usually described as "conversational code-switching". Much of this research has been done on spoken code-switching in informal contexts, but there has also been substantial investigation of institutional settings such as classrooms and offices.
A much smaller body of research has concerned itself with the phenomena of written multilingualism. In some ways, this is surprising. Undoubtedly, there is a monolingual bias in most industrialised societies: the regulatory tendency which validates only "pure" language and regards language mixing, written or spoken, as illegitimate or simply ignores it. But in spite of that, there is a great variety of written data which involves more than one language within a text. There is data both old and new: from ancient and medieval times, from traditional genres such as medical texts and formal letters, from recent, still-developing genres such as advertising and email and from a range of text types in between. Despite the variety of data, written language mixing remains relatively unexplored and under-researched, a point made by many of the recent authors on this subject.1 It would be misleading to say that there has been "hardly any" research in this area, since there is in fact a considerable body of work, some of it by linguists, some by specialists in literature and some by people who are both. However, it is distinguished by a number of characteristics:
- It has no independent theoretical framework; all linguistic research in this area to date which is not purely descriptive has drawn on theoretical frameworks originally developed for spoken code-switching research, or occasionally on theoretical frameworks from other disciplines.2
- Published research tends to take the form of stand-alone papers, which typically deal with a single set of data. Very few researchers have produced more than one or two papers on this topic, suggesting that for most of them their main interests are elsewhere. Book-length treatments are extremely rare.3
- A lot of research on written mixed language (it is difficult to know how much) remains unpublished. A study of code-switching using a corpus of magazines, personal letters or, more recently, email messages is a popular subject for MA dissertations in my department and no doubt in many others. These unpublished writings alone would probably add up to a substantial data resource if they were accessible, but most are not.
Why has the study of written mixed language been neglected? Several reasons suggest themselves. Firstly, in spite of rejecting prescriptivism, linguistics and its related academic disciplines have tended to have a pedagogical focus and hence, to be normative. This has produced a monolingual bias which makes it difficult for researchers who are identified with the study of a specific language ("Anglicists", "Germanists", "Hispanists" etc.) to deal with texts which consist to a large extent of a language different from the one they are supposed to specialise in.
Secondly, in comparison with code-switching in the spoken mode, which has been the subject of numerous theoretical treatments and for which a number of competing and complementary theoretical approaches are available, written code-switching has been little theorised. The lack of a coherent framework which can provide a context for discussing and analysing the data ensures that many of the studies remain mostly descriptive, or confine themselves to a comparison with spoken data. The fact that there is no independent, theoretically informed field of "written multilingual discourse studies" further means that some work of publishable standard (such as some MA dissertations) probably remains unpublished and, for the most part, inaccessible to other researchers.
I will argue here for a new approach to written mixed-language discourse, with three key points: (1) the study of written mixed-language discourse needs to be situated within a broader field which deals with the semiotics of mixed-language texts in the broadest sense; (2) the production and reading of mixed-language written texts need to be studied within a literacy framework, in order to understand the acts of writing, reading and language mixing within the context of the literacy practices of which they are a part; (3) visual and spatial elements of the written form potentially provide important contextualisation cues (in other words, are an integral part of the interpretation of the message) and therefore need to be included in any framework which seeks to do justice to the semiotics of written mixed-language texts, even though they may not be relevant to all such texts.
Before discussing these points further, it will be useful to have a review of the development of research into written language mixing.
A Short History of Research into Written Multilingual Texts
Much if not all of the earliest linguistic research on mixed-language texts was actually done on representations of speech embedded within texts belonging to a written genre. This line of research began with an important study by Birgit Stolt (1964) on the German-Latin mixing in Luther's Tischreden, Luther's mealtime conversations recorded by his followers who dined with him. Early studies usually treated the written texts as a source of evidence about spoken practices (the only available evidence, in the case of texts predating modern sound recording technology) and tended to be concerned with the kinds of syntactic and pragmatic issues which also preoccupied researchers of spoken code-switching at the time. For example, Timm's 1978 study of French-Russian code-switching dialogue in Tolstoy's novel War and Peace deals with syntactic constraints on switching, using the vocabulary of phrase structure analysis. For most of the earlier researchers, the written data was considered a secondary source, complementary to spoken data but possibly somewhat less reliable as an indication of bilingual behaviour. Such representations of speech were, of course, likely to be stylised to some extent and affected in poorly understood ways by the shift of modality from spoken to written. The focus on spoken data as the primary object of interest is also seen in more recent research; however, as knowledge about spoken code-switching practices has expanded and become more robust and theoretically informed, researchers have begun to carry out studies comparing documented spoken practices and their representations in texts of different sorts. For example, Moyer (1998) studied a humorous newspaper column in Gibraltar and compared the Spanish-English code-switching represented there with audio-recorded data. Callahan (2004) makes an extensive study of Spanish-English dialogue in prose fiction, using the Matrix Language Frame model developed by Carol Myers-Scotton (1993).
Callahan concludes (2004, 2) that "the successful application to a written corpus of a model developed for speech validates the use of written data, and shows that written codeswitching is not inauthentic." Thus, gradually, written data has become more "respectable" as data in its own right in structurally-oriented code-switching research.
For researchers whose interest is in historical bilingual practices, comparison with actual spoken data has never been possible. Although the finding that written representations of code-switching can reliably reproduce spoken practice is of interest to those working in this area, their focus has always necessarily been on the text itself as a literate practice. Stolt's work, mentioned above, was the pioneer in this area. Although clearly of interest, this area seems to have lain fallow for a few decades before attracting attention again from researchers like Laura Wright (on linguistically complex medieval accounting practices), Päivi Pahta (writing on medieval scientific texts) and Herbert Schendl (who has written extensively for over a decade on language mixing in medieval documents).4
While a focus on syntactic constraints continues to be the main research interest for some researchers looking at written representations of spoken discourse (e.g. McLelland 2004), others, like Carla Jonsson, have taken a wider perspective, to look at the role of code-switching in the text as a whole. For example, Jonsson (2005, 252–3) concludes that code-switching serves local as well as global functions in the dialogue of the Chicano plays she studies: "Some of these local functions correspond to functions in the typology suggested by Gumperz as well as other typologies developed for oral CS. This makes it possible to argue that although Gumperz' typology was developed to account for C[ode]S[witching] in speech, it also proves to be relevant in the analysis of code-switching in Chicano theater, i.e. in writing intended for performance".
Moreover, Jonsson finds that language mixing in these plays also has global functions across the text as a whole, relating to power relations and identity construction. "CS and C[ode]M[ixing] are used to resist, challenge and transform power relations and domination". Furthermore, they are used "to construct and reconstruct a multifaceted and complex Chicano identity that draws on, at least, two cultural environments [...] Most importantly however, CS and CM allows for the reflection, construction and reconstruction of a hybrid/third space identity, which is fluid and always in transition." (2005, 254).
The interest here is therefore not just in the local functions of language mixing, but in the whole text as a genre which is embedded in, and characteristic of, a linguistically hybrid culture.
The notion of mixed-language written genres worthy of study in their own right is itself not new: research into historical documents has been carried out for many years, as mentioned above. However, the advent of the Internet has resulted in an increased interest in written code-switching, and a number of studies of specifically digital genres, e.g. Hinrichs (2005, 2006) on code-switching in emails, McLellan (2005) on online discussion forums, Montes-Alcalá (2007) on bilingual blogging, Lam (2009) on instant messaging, Paolillo (1996, forthcoming), Androutsopoulos (2006, 2007) and Sebba (2003, 2007). Samu Kytölä's chapter in this volume gives a more detailed overview of some of this research.
Most researchers in this area have drawn, more or less extensively, on the available theories of spoken code-switching. In this area, undoubtedly, the classic and pioneering work was that of Gumperz (Blom and Gumperz 1972; Gumperz 1982) and almost all treatments, apart from those which are purely structural and syntactic, owe a substantial amount to his research. Among his central contributions to research in this area are the notion of code-switching as a contextualisation cue, the distinction between situational and metaphorical switching and a typology of discourse functions of code-switching. All of these concepts are potentially applicable to written language alternation, and most have been applied in some way by researchers in this area.
Since the 1990s two frameworks have predominated in sociolinguistic research on spoken code-switching. The Markedness Model (Myers-Scotton 1993) accounts for code-switching in terms of "rational choices" by speakers who choose a code from their repertoire to activate sets of "rights and obligations" associated with that code. The concepts of the Markedness Model can be applied, at least to some extent, to the more conversation-like and interactive written genres, e.g. online chat. However, it would be harder to apply to other types of written data which are less interactive, or where one or both of the interacting parties is anonymous.
The conversation analysis model (Auer 1984, 1995, 1998; Li Wei 1998, 2005) can likewise be adapted to work with more conversation-like interactive data, but because of the crucial role played by interlocutors' responses within this approach (i.e. the central role of sequentiality) it is impossible in practice to apply it in any useful way to non-interactive written data.
Written Language Mixing: some Analytical Issues
The majority of studies of written multilingual discourse to date, to the extent that they deal with the motivations for switching, have applied one or more of the three models above, namely those associated with Gumperz, Myers-Scotton or Auer. However, as none of these models was developed originally to deal with written texts, the difficulties which researchers face in trying to apply them to data in a different medium can be considerable. Furthermore, there is a danger that applying concepts developed as a way of approaching and explaining spoken discourse will have the effect of limiting the research on written discourse, imposing constraints on the types of phenomena which can be studied or which even appear to be worthy of study.
The written medium encompasses a great diversity of genres, most of which do not correspond to spoken genres in spite of overlap in some cases. Furthermore, the focus within bilingualism research on spoken code-switching and, to a much lesser extent, its written counterparts, has led most researchers to concentrate on written text as text (in other words, as strings of words on the page or screen) rather th...