Part I
Measuring populist ideas
1 Textual analysis
Big data approaches
Kirk A. Hawkins and Bruno Castanho Silva
One of the main challenges in studying populism in comparative perspective is finding ways of measuring it across a large number of cases, including not just multiple countries but multiple parties within countries. Most large-N studies classify cases by fiat, based on literature reviews or the judgments of country specialists (see, for example, Doyle 2011; Levitsky and Loxton 2013; Bustikova 2014; Mudde 2007; Mudde 2014). The problem with the first approach is that it often relies on second-hand literature instead of primary sources, and has little room for testing reliability. The problem with the second is that it depends on experts’ different conceptions of populism, which frequently diverge from the ideational approach adopted in this volume. Ultimately, these approaches struggle to provide an objective basis for comparing cases across different contexts.
While other chapters overcome this challenge by implementing techniques such as surveys of political elite attitudes or systematic expert surveys, here we provide a head-to-head comparison between two techniques of textual analysis: holistic grading and computer-assisted textual analysis through supervised learning. Generally speaking, textual analysis is useful for anyone using the ideational approach to populism because it focuses on ideas, and because the ideas of political elites can be hard to measure through anything except texts, such as speeches or manifestos, that politicians produce for audiences other than the researcher. Most of the earliest efforts at measuring populism objectively were, in fact, textual analyses by scholars favoring some kind of ideational definition (Armony and Armony 2005; Jagers and Walgrave 2007; Hawkins 2009). Here we explore a particularly difficult version of this challenge by adapting textual analysis to the measurement of populism across party systems in large numbers of countries. Much of the current methodological literature on textual analysis promotes automated textual analysis as an ideal means of dealing with “big data” issues (Quinn et al. 2010; Hopkins and King 2010; Laver, Benoit, and Garry 2003; Lucas et al. 2015). However, populism is a different sort of idea from the themes being measured in much of that literature, and human techniques might be more appropriate.
To help resolve this issue, and to provide data that we use in other chapters, we test a validated human-based approach to measuring populism—holistic grading (Hawkins 2009)—and apply it to 144 parties from 27 countries in Europe and the Americas. Specifically, we look at campaign documents—electoral manifestos and speeches by party leaders—from all main parties in each political system to observe how populist each actor is, and compare that to a range of international cases. In the process we create the first cross-regional dataset classifying entire party systems. With these data in hand, we first observe how populism is distributed across the regions in this study, and how specific parties are classified. Next, we compare these results to supervised learning methods that can, theoretically, also confront the challenge of categorizing large numbers of texts across multiple languages. We find promising results for both.
Holistic grading
We approach the challenge of measuring populism with a very specific definition in mind, the ideational one. This defines populism as a discourse dividing the political world into two camps: the good, identified with the virtuous will of the common people; and the evil, embodied in a conspiring elite. Because populism sees the political system as having been subverted, it calls for a “systemic change,” or liberation of the people from the grip of the elites. It also tolerates undemocratic means to achieve this goal since, in this framing, the elites are thieves who do not deserve fair treatment, and the enforcement of the people’s will should not be blocked by formalities and institutions.
The ideational definition lends itself to operationalization and measurement, because it identifies elements that should be present in a discourse for it to be populist. Following it, researchers have used different content analysis methods to measure populism in recent years. Jagers and Walgrave (2007) test a dictionary-based approach to classify populist parties in Flanders, which is extended to three more countries in Rooduijn and Pauwels (2011). This technique consists in defining a dictionary of “populist” terms and classifying documents based on their frequency. In contrast, Rooduijn, de Lange, and van der Brug (2014) use human-based content analysis of party manifestos from five European countries. This approach has paragraphs as units of analysis, and uses trained coders to classify each one as populist or not, with the aggregated proportion of populist paragraphs being the party score. A third comparative approach has been put forward in Hawkins (2009), and consists of holistic grading. There, chief executives’ speeches are coded as a whole, without breaking them down into words or paragraphs.1
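The two counting-based approaches described above can be illustrated with a minimal sketch. It assumes a dictionary score defined as the share of word tokens that match a term list, and a paragraph score defined as the share of paragraphs that human coders marked as populist. The term list, function names, and sample sentence are hypothetical and are not taken from the cited studies, which build country- and period-specific dictionaries and codebooks.

```python
import re
from typing import Iterable, Sequence

# Hypothetical term list, for illustration only.
POPULIST_TERMS = frozenset({"elite", "elites", "corrupt", "establishment", "betray"})


def dictionary_score(text: str, terms: Iterable[str] = POPULIST_TERMS) -> float:
    """Share of word tokens in `text` that appear in the term list."""
    term_set = set(terms)
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in term_set)
    return hits / len(tokens)


def paragraph_score(labels: Sequence[bool]) -> float:
    """Share of paragraphs that coders marked as populist (True)."""
    if not labels:
        return 0.0
    return sum(labels) / len(labels)


# Usage with made-up inputs: one short sentence and four coded paragraphs.
sample_dictionary = dictionary_score("The corrupt elite betray us")
sample_paragraphs = paragraph_score([True, False, False, True])
```

Holistic grading, by contrast, has no counting step of this kind: the coder reads the whole text and assigns a single score.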
Of these alternatives, this chapter uses the third. Although the Introduction to the volume raises methodological concerns regarding the accuracy of other techniques, our reasons here are more practical. The dictionary-based technique demands detailed knowledge of each specific country and time period for the selection of relevant terms. It is feasible in single case studies or small-n comparisons, as in the chapters by March (Chapter 2) and Šalaj and Grbeša (Chapter 3), but becomes less so when a larger number of countries are included. The other two techniques both start from a similar definition of populism and could potentially be used for the purposes of this study. Hawkins’ approach has the upper hand, however, because it has been tested and validated across a large number of countries and time periods. The original study (Hawkins 2009) included 40 contemporary and historical presidents and prime ministers from Latin America, Europe, and Asia, while a second round was done with chief executives from Eastern Europe and Central Asia (Hawkins 2013). Furthermore, because it works at the level of whole texts, it can be used to generate data relatively quickly, at least for human-coded analysis. The technique by Rooduijn et al. (2014) has not yet been applied outside of France, Italy, Germany, the Netherlands, and the United Kingdom, and while it has the clear advantage of generating more fine-grained data, it is much more time-consuming.
Holistic grading was developed in educational psychology for assessing students’ writing (White 1985; Sudweeks, Reeve, and Bradshaw 2004). It is a human-based approach that evaluates the text as a whole. Coders are trained to assign scores based on the elements of the concept and a set of anchor texts defined as examples for each score on the scale. In our case, coders were trained in English on the concept of populism, and the set of training documents and anchor texts were also in English. In order to ensure the validity of our measure, the anchor texts were drawn from a variety of regions and capture different ideological flavors of left and right.2 The training emphasized that the most important dimension of populism is the notion of a reified will of the common people, and that this reified people has to be defined in opposition to an “elite,” which is powerful and oppressive. Therefore, even if there was a great deal of anti-elitism in the text, coders could not assign a score of 1 unless there was also a reified will of the people. As in Hawkins (2009, 1062), scores come from a 0–1–2 scale defined as follows:
0 A speech in this category uses few if any populist elements. Note that even if a manifesto expresses a Manichaean worldview, it is not considered populist if it lacks some notion of a popular will.
1 A speech in this category includes strong, clearly populist elements but either does not use them consistently or tempers them by including non-populist elements. Thus, the discourse may have a romanticized notion of the people and the idea of a unified popular will (indeed, it must in order to be considered populist), but it avoids bellicose language or references to cosmic proportions or any particular enemy.
2 A speech in this category is extremely populist and comes very close to the ideal populist discourse. Specifically, the speech expresses all or nearly all of the elements of ideal populist discourse, and has few elements that would be considered non-populist.
Because coders in earlier studies reported that it was often difficult to choose between these blunt categories, this time we instructed them to give decimal scores. We told them that 0.5 rounds up to a categorical 1 and 1.5 rounds up to a categorical 2, and that they should consider these qualitative differences when assigning decimal points. After the training, coders received the texts, whether speeches or manifestos, and coded them in their original language (most coders were native speakers). For each text, the coder filled out a rubric that included a score, representative quotes, and a qualitative summary of their judgment. Each coder worked independently and filled out one rubric per document; afterwards, final scores were discussed with the other coders and the coordinator to clarify questions and check for possible misunderstandings.
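The rounding convention given to coders can be written down as a short sketch. The function name is hypothetical, and the code only illustrates the half-up rule stated in the text; it is not part of the coding procedure itself. Note that Python's built-in round() sends ties to the nearest even integer, so it would map 0.5 to 0 rather than 1.

```python
import math


def to_category(score: float) -> int:
    """Map a decimal score in [0, 2] to the 0-1-2 scale, rounding halves up."""
    if not 0.0 <= score <= 2.0:
        raise ValueError("score must lie between 0 and 2")
    # floor(x + 0.5) implements round-half-up; the built-in round() would
    # send 0.5 to 0 because it resolves ties toward the even integer.
    return math.floor(score + 0.5)


# Usage: a decimal score of 0.5 counts as a categorical 1, and 1.5 as a 2.
categories = [to_category(s) for s in (0.0, 0.4, 0.5, 1.4, 1.5, 2.0)]
```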
Sampling
Two innovations were introduced to this sample relative to previous studies using holistic grading. First, because our goal was to capture the level of populism in the party system, the sample was expanded from chief executives to all major parties (usually all those receiving more than 1% of the vote). Second, in order to capture as many parties as possible, we focused on coding party manifestos rather than speeches. Manifestos are a common choice not only in analyses of partisan ideology but also in the populism literature (Rooduijn and Akkerman 2017; Rooduijn and Pauwels 2011; see also March’s chapter in this volume). The main reason is that these documents reflect a party’s discourse as an institution, which may be distinct from that of its top candidates. Also, together with speeches, manifestos are the documents most comparable across countries: almost everywhere parties produce some kind of election program. This means we are looking for populist discourse in documents that are produced and made public with similar purposes across cases. Moreover, as a practical matter, manifestos are more accessible than speeches across time and space. It is very difficult to find speeches for defeated candidates (and sometimes even for winning ones) more than one or two election cycles back, while manifestos are usually available for longer periods. We think this makes manifestos an essential text for studies of whole party systems.
Some might question the relevance of party manifestos in certain contexts. In many countries, especially in Latin America, few voters actually bother to read them, and people are often unaware that parties even have them. There are a few standard answers to this objection. First, there is the empirical observation that parties actually do implement a fair share of what they promise in their electoral manifestos (e.g. Bara 2005; Budge and Hofferbert 1990; Naurin 2014). Therefore, even if no one is reading them, manifestos provide insights into what political parties are thinking ...