Discourse data
How many words do you think you have spoken this week? How many have you heard spoken? (And how do you define a word – do ‘um’/‘erm’ count?) How many words have you read this week? (Including those you may have read inadvertently, like labels and signs you encounter in passing.) How much have you written? (Including online/on your mobile phone.) Now imagine multiplying all the linguistic communication you have been involved in during this one week by all the weeks of your life so far, and then by all the people alive now, and then by all the human beings who have ever left any records – written, recorded as audio signals, or in any digital form. If we assume that most of this communication could be classified as ‘discourse’, we get some idea of the ‘universe’ of data that might potentially be available to discourse analysts – and that is before we extend the range to include non-linguistic signs, such as photographs, soundtracks, emojis, and so on (see Chapters 8 and 9, this volume, which take stock of the ‘multimodal turn’ that discourse analysis has undergone in recent years to account for the wider range of semiotic modes used in contemporary communication and the interactions between these).
So, when you set out to design a research project around ‘discourse’, an early stage in the process will necessarily involve narrowing down your focus, and there are various ways that you might do this. Each chapter in this book takes a different approach to discourse analysis, and this often includes making different decisions about what kind of data to investigate. However, while such differences sometimes reflect contrasting perceptions about the very nature of discourse, in other cases the differences are more a matter of emphasis.
One way to set some boundaries around which data to collect is to identify some type or genre of communicative event or activity as your starting point. This could be, for example, informal conversations among friends, workplace meetings, political interviews or classroom interactions (e.g. Chapter 2, this volume), or it could be the virtual social gatherings enabled by digital media (e.g. Fester and Cowley, 2018). Data will then likely be restricted to detailed records of these interactions, in the form of recordings and transcripts of talk, or archives of messages exchanged, etc. More broadly, the starting point may not so much be a type of event, but rather a social setting, such as a school, small business, nursery, or community centre (e.g. Chapter 3, this volume), or even more formal institutional settings such as the Convention on the Future of Europe (Krzyżanowski and Oberhuber, 2007). In this kind of approach, the data may comprise a range of materials, including written texts, images, interviews with the people in the setting, field notes, and so on.
Some discourse analysts are particularly interested in how different modes of communication shape the way communication unfolds. I know of several researchers who choose to explore exclusively written texts because of the challenges posed by working with speech. These include, for example, taking into account all the paralinguistic and prosodic features of spoken language, which are very difficult to capture in transcriptions (see e.g. Cook, 1990). On the other hand, for some researchers, this is exactly what interests them – how the different components of face-to-face communication interact with one another. So if your interest lies primarily in one or more modes of communication, this could entail contrasting two kinds of data, such as authentic informal conversation and scripted talk that aims to simulate casual interactions. Alternatively, your interest in a specific mode might lead you to restrict your data to a single mode, such as telephone conversations, emails, formal letters, Facebook posts, or tweets. These are all examples of how a focus on the mode of communication leads to the selection of particular types of discourse from the vast range of potential data available for a discourse analysis project.
An aspect of discourse that intrigues some researchers is how it comes to take the forms it does. For some analysts, this line of research entails collecting very large quantities of data in order to reveal patterns in the way words and phrases behave, including as they co-occur with one another (see Chapter 7, this volume). This is particularly interesting because users of language themselves are often not aware of these patterns. Other analysts look from the other end of the telescope, so to speak, zooming in on the internal processes that must be happening within the minds of language users to account for the formation of particular concepts (see Chapter 6, this volume). As Hart explains, such ‘cognitive’ approaches tend to use as data texts that at least have the appearance of being ‘monologic’ (i.e. having been produced by one voice) rather than conversations, which are inherently ‘dialogic’ (i.e. produced in more interactional settings). I return to the issue of the production of data in the next section. Some analysts claim that these ‘two ends of the telescope’ are inevitably at odds with one another, but others believe that they need not be. For example, Hoey (2005) seeks to account for a central phenomenon associated with corpus analysis, namely ‘the recurrent co-occurrence of words’, and argues that it is a psychological concept, ‘priming’, that explains this. So his claim is that ‘the mind has a mental concordance of every word it has encountered’ which ‘can be processed in much the same way that a computer concordance is’ (2005: 11; see also Gries, 2005, 2006). These examples begin to point to another of the issues explored in this book: how much data is needed for different kinds of analysis, and does the analyst measure phenomena (quantitative analysis) or interpret them (qualitative analysis), or does the research, as is often the case, involve a combination of both?
Yet another point of departure in deciding on the kind of data to collect is the identification of a social problem, such as racism or gender inequality, which discourse plays a part in creating and sustaining. Again, this perspective and those summarised above are not mutually exclusive. The point is just that the primary motivations of the analysts may be different. That is, while one researcher investigates, say, casual conversations among friends in order to better understand turn-taking procedures in their own right, another may analyse the same data with a view to exploring gender dynamics and the way some speakers assert dominance over others. One form of discourse analysis directly concerned with issues of power and inequality is critical discourse analysis (CDA), a leading proponent of which is Norman Fairclough. Fairclough, and others working in this tradition, take care to point out that CDA is not a particular method or subdiscipline of discourse analysis, since a critical perspective is possible in any approach to discourse analysis. The relevance to us here is that CDA is discourse analysis that ‘explicitly defines and defends its own sociopolitical position’ (Van Dijk, 2001: 96). So, in this tradition, the starting point is a perceived social problem and the selection of data is guided by a concern to highlight and address ‘the role of discourse in the production and reproduction of power abuse or domination’ (Van Dijk, 2001: 96). For this reason, many CDA projects select as data the discourse that is produced by ‘elite’ social actors, agencies, and institutions, such as politicians or the press, whose discourse, arguably, exerts the most influence over society. For many researchers in CDA the ultimate goal is to resist power and inequality as they are expressed in, and enacted through, discourse (see below for further discussion of what it means to take a critical stance).
Finally, for now, it is worth recognising that there has been a ‘discursive turn’ across the social sciences, and with this an increasing degree of collaboration between discourse analysts and researchers in other disciplines. For example, I gained access to a data set of transcriptions of parliamentary discourse (nearly 1000 sessions of Prime Minister’s Questions) through a collaboration with a political scientist. His interests are primarily in political processes and how these are enacted in these events, and our joint analyses have focused sometimes more on these issues (Holden Bates and Sealey, 2019) and sometimes more on the pragmatics of the interactions (Sealey an...