Part One objectives
- To set out the scope of this book in terms of what you can expect to learn from reading it (and, of course, doing the exercises!).
- To define some key terms and concepts we will be using as we go forward.
- To begin to think about ‘Programming-as-Social-Science’ as a unique approach to understanding the world which integrates (unsurprisingly) Python programming techniques with the work of social science.
Welcome to the book! Part One intends to get everybody on the same page and kick us off by outlining the core ideas of computer programming, and why we might be interested in learning how to program as social scientists. We’ll be asking and answering questions going right back to ‘so what is computer programming anyway?’, as well as figuring out ways of looking at and using programming that are helpful and valuable to us as social scientists. This is going to form the background to the later chapters, where we’re learning how to read and write Python computer code; the things we cover here aim to give context to what you’re learning later.
So, while Part One doesn’t directly deal with the mechanics of how the Python programming language works, it is still really useful for keeping our focus not just on programming as an instrumental means to an end, but on thinking about programming in such a way that it can be leveraged for social scientific purposes.
Let’s get to it!
Both ‘social science’ and ‘computer programming’ are slippery terms that cover a lot of diverse topics and research practices. This makes it difficult to pin a date on when either can be said to have started. But by any measure, both have been around for a long time. Arguably, modern social science emerged out of the Enlightenment period in Europe in the mid-seventeenth century, with figures such as Thomas Hobbes and John Locke producing work on philosophy, morality and politics that went on to inform the more explicitly proto-sociological developments of eighteenth- and nineteenth-century thinkers like Adam Smith, Henri de Saint-Simon and Auguste Comte. Perhaps surprisingly, given how we might think of computers being a relatively recent technological development, it’s generally agreed that the computer program can be dated as far back as 1843, with Ada Lovelace’s creation of the first algorithm which formalised a set of instructions for computing a sequence of ‘Bernoulli numbers’ to be carried out via Charles Babbage’s (then designed but unbuilt) ‘Analytical Engine’.
But despite the fact that both social science and computer programming are long-standing enterprises, it’s only far more recently that academics have begun to explore how the two might be combined. Social science, especially those areas of it that focus on topics where collaboration is inevitable, such as human–computer interaction and Computer-Supported Cooperative Work, has already produced more than a few studies of the work of programming (see the further reading section below). In these studies, the focus has been on the social aspects of how programmers manage their working together: how the work of programming gets done by programmers. However, doing a study of computer programming is not quite the same as using computer programming to do a social scientific study. Why is it that we’re only now starting to think about what computer programming might offer as a tool and skill for social scientists?
Further reading
Social Studies of Programming
Graham Button and Wes Sharrock’s mid-1990s works in the field of Computer-Supported Cooperative Work are great examples of the kinds of interest and approach a social science researcher might bring to studying programming as an activity:
Button, G. and Sharrock, W. (1994) Occasioned practices in the work of software engineers. In M. Jirotka and J. Goguen (eds), Requirements Engineering: Social and Technical Issues. London: Academic Press, pp. 217–240.
Button, G. and Sharrock, W. (1995) The mundane work of writing and reading computer programs. In P. Ten Have and G. Psathas (eds), Situated Order: Studies in the Social Organisation of Talk and Embodied Activities (Studies in Ethnomethodology and Conversation Analysis No. 3). Washington, DC: University Press of America, pp. 231–258.
The shift is largely motivated by the fact that digital data and the internet as a site of everyday social interaction have significantly changed the playing field of the social sciences. Within this emerging field of ‘Digital Social Science’, both the topics and methods associated with social science research have become increasingly computer-oriented. As people interested in studying social life, we can’t ignore that a lot of what people get up to in everyday society is organised around and through the use of things like computers, the internet, search engines, entertainment-streaming services, social media, smartphone apps, and so on. This is hugely important across the whole of social science; as Housley et al. note, focusing on the digital world offers social science the potential not only to extend its reach into new forms of sociality, but also to deepen our understanding of existing forms: ‘these technologies and their allied data have the potential to “digitally-remaster” classic questions about social organization, social change and the derivation of identity from collective life’ (2014: 4). Moreover, if we want to find out about any of these things as social scientists, we will inevitably have to draw on a variety of digital tools and methods to do so. Even ethnography, a method more typically associated with the physical presence of a researcher within a participant’s setting, is not exempt from these effects; as Hallett and Barber (2014: 308) note, ethnographic researchers ‘need to reconceptualize what counts as a field site … studying a group of people in their “natural habitat” now [often] includes their “online habitat”’. So, engaging with computational and digital tools and data has already become integral to what social scientists do.
This book seeks to extend that thinking and demonstrate (among other things) that learning how to program can significantly enhance how social scientists can think about their studies, and especially those premised on the collection and analysis of digital data.
Definitions
Digital Social Science/Computational Social Science
To say there is a single field called ‘Digital Social Science’ is a bit misleading; in reality, this term is so broad as to cover lots of different, constantly shifting forms of philosophical orientations, disciplinary commitments, research approaches, topics and methods. So I’ll use the term ‘Digital Social Science’ in an intentionally very loose way which doesn’t make any kind of futile attempt to unify all of these things, but which still serves as a shorthand for any kind of research that somehow involves using digital tools and methods (whatever those may be) to explore digital topics (whatever those may be!).
Because you might see the term used elsewhere, it is worth noting that there already exists a specific area of inquiry called ‘Computational Social Science’, which seeks to take a scientifically systematic approach to the application of tools like programming to social science problems and questions (cf. González-Bailón, 2013; Lazer et al., 2009; Nelson, 2017). However, the approach I’m presenting here (programming as social science) diverges slightly from the scientifically systematic orientation of Computational Social Science for reasons explained in Chapter 2, so for practical purposes and ease of reference, this body of work will fall under the term ‘Digital Social Science’ throughout.
However, the point of this book is not (just) to provide social scientists with an introduction to the mechanics of the Python programming language; such introductions are already available across lots of books and websites (though few are tailored to the specific needs of social scientists as this book is). Nor is it to suggest that social scientists need to adopt programming as a way of ‘formalising’, ‘mathematising’ and/or ‘automating’ their work, as if such things were even possible (see Section 1.2 for fuller details of the issues surrounding these ideas). Indeed, arguments against scientism (i.e. the idea that our work should operate more like the natural sciences of physics and chemistry) and the bureaucratisation of social science (i.e. the transformation of our work into a purely technical non-interpretive ‘number-crunching’ exercise) have been a defining characteristic of influential thinking about the role and purpose of social science, from Wittgenstein (2009 [1953]) and C. Wright Mills (2000 [1959]) to Button et al. (1995) and Savage and Burrows (2007). The strength of the social sciences has always been in their resistance to scientism and bureaucratisation, and in this sense, this book emphatically does not approach programming as a way of ‘upskilling’ social science or reifying computer science/scientists as a gold standard to strive towards. We are already very skilled at what we do, and no amount of computing power or speed could possibly compete with our human capacity for critical thinking, for methodological reflexivity, for generating critiques and counter-narratives, for motivating social change and activism, and so on.
Rather, the point of this book is to show how the work of programming can fit into and enhance the skillsets and knowledges we already have, and in doing so bring about a uniquely social scientific approach to programming as a research method (we’ll go on to call this ‘Programming-as-Social-Science’, or PaSS for short) that we can leverage to do the work we want to do, in ways that we want to do it. The remainder of this book can be read as an elucidation of this central theme.
0.1 Who Is This Book for? Why This Book?
There are short answers to the above questions. Who is this book for? It’s probably no surprise that the intended audience of a book called Programming with Python for Social Scientists is social scientists who want to learn to program. In Python. And why this book? That can be boiled down to the following statements:
- Social science digital methods resources don’t typically cover programming. This makes it difficult to think about the practical aspects of doing Digital Social Science work, and it limits our capacity to be reflexive about our methods and methodologies when we use software tools developed by others.
- Python programming resources, on the other hand, don’t speak to the requirements of social science, and this makes it difficult to see how the general-purpose/abstract knowledge and skills transmitted through those resources can fit into the work we are trying to do.
- The value of this book is that it handles both social science and computer programming simultaneously: it demonstrates Python as a social science research toolkit and walks through some examples of those tools in use in social-science-relevant tasks, but also shows you how to think about programming as a research method more widely.
It is, however, worth exploring a slightly longer answer to each of those questions, one that goes into who the social scientists in need of programming skills are, and what they will get from this book that they can’t get anywhere else. The following sections will go into more detail about the kinds of work and thinking that programming can facilitate (and, by the same token, what kind of person would be interested in learning how to do those things). But for now, suffice to say that this book is for those people (students and researchers in the social sciences) looking to build skills with digital data and methods, in relation to both quantitative and qualitative research. The chapter headings of Part Three indicate the kinds of thing that ‘building skills with digital data and methods’ constitutes; for instance, there are chapters on working with text files (i.e. data stored in .txt format), chapters on drawing data through social media platforms via Application Programming Interfaces (APIs), chapters on web scraping, chapters on vis...