1.1 Big Data and Technological Discourse
Big data has of late been attracting great attention in business, government and science, from where it is propagating into wider societal circles that respond to new forms of knowledge and contribute to its dissemination. Aiming at a deeper understanding of what big data means and of its impact in society, this work intends to prioritise a linguistic and discursive approach to the big data debate. That phrases such as artificial intelligence, the internet of things and, lastly, big data should be coming to the fore in discourse can be ascribed to the hype cycle of technological expectations and the widespread adoption of computational and digital practices in contemporary society. 1 However, the conceptualisation and application of innovative socio-technical constructs in key societal domains like science, medicine, law, business, politics and government policy are still under-researched.
This imperfect understanding reverberates in the polarised coverage big data receives in the news media and in public and policy discourse. Scientific and technical innovation is hailed as “a major technological revolution that will reshape manufacturing industries and social and economic life more broadly” (Reischauer 2018, p. 26) but, at the same time, the advances brought about by big data tend to be associated with feelings of unease when loss of control and potential threats to human rights are exposed. Examples of this include data aggregation and access control of data repositories on the part of governments and corporate ventures in which the surveillance trends of big data are intensified (Lyon 2014), from the Snowden revelations of the US intelligence’s phone surveillance program in June 2013 2 to the Cambridge Analytica data leak in March 2018. 3 Against this background which is still shaping itself, this research work sets out to offer an overview of the big data debate and a fresh look into the interweaving of discursive perspectives coming from different social actors and fields which are not limited to the technological domain and its experts. In sum, the aim is “to dismantle claims about the given and irrevocable facticity of data formats and data analytics so as to explore ways of reimagining their status and implications” (Pentzold and Fischer 2017, p. 2).
Tracing the etymology of “big data” is an excellent starting point for investigating the potential of discourse and storytelling in the dissemination of technological innovation among lay publics and their contribution to the making of a contemporary technological imaginary. While data, as the plural of the Latin loan word datum, is a countable noun, “big data is grammatically a mass noun, a conceptual shift from single units of information to a homogeneous aggregate” (Puschmann and Burgess 2014, p. 1694). The shift explains why the noun phrase is often capitalised (“Big Data”), 4 as it is perceived as a collective entity and treated as a proper noun. The big data that mesmerises the public focus on science today emerged in common parlance in the early 1990s, but began “gaining wider legitimacy only around 2008” (Boellstorff 2013, n.p.).
Retracing the origin of the phrase has been compared to an “etymological detective story” (Lohr 2013, n.p.). One claim is that big data “was first used by John Mashey, retired former Chief Scientist at Silicon Graphics, to refer to handling and analysis of massive datasets” (Diebold 2012, quoted in Kitchin 2016, p. 1) around the mid-1990s. By contrast, others believe that “[l]ikely the first use of ‘big data’ to describe a coherent problem was in a publication by Michael Cox and David Ellsworth in 1997 attributing the term to the challenge of visualizing large datasets” (Metcalf et al. 2016, p. 4). In any case, since the last decade of the twentieth century, big data has quickly turned into a catchphrase for tech industries, with IBM as the first major company to capitalise on it (Zikopoulos et al. 2012), constructing a very optimistic social reality around big data through discourse. 5 In November 2011, the American Popular Science magazine published a special issue entitled “Data Is Power”, highlighting the empowering effects of data.
Despite several attempts to categorise the identifying features of big data, 6, 7 a stable definition has still not been reached in the literature and “the lack of a systematic meta-discourse surrounding the polysemy of Big Data” (Portmess and Tower 2015, p. 4) is apparent in the debate. Nonetheless, a broad consensus exists on the fact that the “bigness” of data is not just a matter of volume but of mindset (boyd and Crawford 2012; Kitchin 2014a, b; Shin and Choi 2015), as it leads to the break-up of existing research approaches and “the capacity to search, aggregate, and cross-reference large data sets” (boyd and Crawford 2012, p. 663). It is argued, quite convincingly, that it is the complexity of the task, rather than data magnitude, that qualifies big data: “Big Data discursively refers to a qualitative shift in the meaning of data, in not just the amount of data (approaching exhaustiveness) but also its quality (approaching a dynamic, fine-grained relational richness)” (Chandler 2015, p. 836).
In actual fact, though somewhat approximately in light of big data’s defining features (Kitchin, supra note 6), a variety of data sets are commonly included under this denomination: government registries and databases, national surveys and censuses, online multimodal repositories, corporate transactional data, health data collections and biobanks, user-generated content on social media, web indexes and directories, measurements from sensor-embedded environments, surveillance and communication metadata tracing digital footprints and so on. The network of internet-connected technological contrivances capable of collecting and exchanging data is referred to as the internet of things. The label describes a variety of electronic devices, appliances, sensor-laden objects, vehicles, smart watches, wristbands, garments and wearables, self-tracking technologies and apps that are able to interact with the real-world environment, in quite an unprecedented and revolutionary way (Paganoni 2017). 8 “Datafication” (Lycett 2013; van Dijck 2014; Chandler...