CHAPTER 1
A Data-Driven Approach to Urban Science and Policy
In 1996, Baltimore, Maryland, introduced the first 311 hotline.1 It arrived with little fanfare or anticipation of its future influence. Rather, the goal was to solve a relatively mundane practical issue. Inner-city Baltimore was suffering from high levels of crime and blight, and the city was receiving enough reports of shootings and other serious events that calls about “nuisances,” such as graffiti, abandoned buildings, and other issues of deterioration, were themselves seen as a nuisance. The 311 hotline was thus born of a need to triage 911 calls that did not qualify as emergencies.
It was not until a decade later that the advent of digital technology made apparent an additional advantage of 311: Equipped with the information from resident reports, operations departments could generate automated work-order queues that guided the daily deployment of resources. This enhanced the value of 311 systems for major metropolises, but it also raised the possibility that they could make government services more effective and efficient for municipalities of any size. As a result, 311 hotlines and allied programs are now in place in over 400 American municipalities in 40 states and counting, spanning the geographic and demographic range of the country.2 Since then, 311 systems have become a de facto symbol for the field of urban informatics. They have inspired blog posts and magazine articles, including the widely distributed Wired essay “What 100 Million Calls for Service Can Tell Us about New York City.”3 Publications focused on governance have either trumpeted the benefits of 311 outright or coyly posed questions such as, “Is the cost of 311 systems worth the price of knowing?” (coming to the eventual conclusion that “Yes, they are”).4 They have stimulated research projects, including our flagship project at the Boston Area Research Initiative (BARI), which forms the main content of this book and has given rise to methodological and philosophical approaches that guide much of our other work.
The 311 systems have proliferated quickly, but, given that there are plenty of other technological innovations in cities that merit attention, why have they become so emblematic of urban informatics? I would argue that it is because, in addition to their widespread popularity, 311 systems embody each of five major themes whose convergence characterizes the field. The first two themes form the bases of the field: (1) the innovative use of novel data resources and (2) the utilization of crowdsourcing and sensor technologies that provide a detailed view of patterns and conditions across the city. Their value has been amplified by (3) widespread data sharing, or, in its most extreme case, “open data.” This has been a critical mechanism for supporting a civic data ecosystem in which individuals and institutions from a range of disciplines and sectors can pursue and collaborate on questions of common interest. Finally, these collaborations have been channeled into two main, and often complementary, products of the field, which constitute the fourth and fifth themes: (4) technocratic policy innovations that improve the efficiency and effectiveness of city services, and (5) the scientific pursuit of a deeper understanding of the city and its people, places, and systems. Importantly, this view highlights two lessons that are often overlooked, especially by “smart cities” narratives. First, the products of modern data and technology need not be immensely expensive or flashy to be both informative and useful. Second, cross-sector collaborations are critical in generating these products.
Part I of this book presents an overview of the field of urban informatics, using 311 to illustrate how modern digital data can catalyze cross-sector research that generates both new knowledge and public value. This first chapter articulates and details the five main themes of the field, describing how 311 reflects each. In addition, because urban informatics is a young field and thus still relatively small, it is possible for me to summarize in this chapter many of the primary research programs that compose it. I do not provide a stand-alone list of these programs but instead describe various examples throughout the chapter in order to capture the five themes in action while also giving the reader a sense of the range of models for this work.5 I will go into some depth on BARI, discussing why 311 has acted as the jumping-off point for us. Whereas the current chapter emphasizes the inspiration and potential of the field, Chapter 2 will follow with a more critical assessment of how one properly conducts research with the novel digital data resources that form the bases of urban informatics, again using 311 to demonstrate both the challenges and the possibilities.
The Bases of Urban Informatics
New Technology, New Data
At its foundation, urban informatics has emerged from recent advances in digital data and technology. These resources have generated new information that I divide into two forms for the purpose of presentation: enhanced forms of old information, and novel information produced by new technologies. In the first, the digitization of many administrative processes that previously existed only on paper has given rise to numerous data sets that capture the patterns of the city in intricate detail. This is occurring across the public, private, and nonprofit sectors, with examples ranging from credit card purchases, to rides on public transit, to entries to community centers; the tracking of energy and water usage to yearly vehicle inspections; the marriage registry, business licenses, building permits, tax assessments, and restaurant inspections; and, of course, requests for public services through 311 systems. This list is far from exhaustive, but it gives a sense of the diverse range of data generated by the individuals and institutions of the city. All told, their digitization makes newly accessible a wealth of information on the behaviors, movements, social interactions, commerce and industry, and physical and environmental conditions of the city.
As digitization increases the potential utility of administrative records, two other technologies are generating entirely new kinds of information. The first of these technologies builds on social media and other internet sites and applications that gather user-generated content, also known as Web 2.0. The content shared with these platforms (Yelp! reviews, Picasa pictures, YouTube videos, exercise and sleep activity from FitBit bracelets, “tweets” through Twitter) is data that one might organize, map, analyze, and interpret. A subset of Web 2.0 applications also supports direct communication between a client and a service provider, be it private or public, capturing every transaction as a data record. This capacity has taken hold in 311 as well. Boston launched Citizens Connect (now BOS:311) in 2008 as an early effort to introduce a smartphone app for a municipal 311 system, leveraging the internet and smartphones as an additional channel for constituents to request government services. Other cities have since followed suit.
The second technological advance of note is the proliferation of sensors. Some examples include GPS trackers for geographic mobility patterns; accelerometers that detect different types of physical movement; and sensors that record the density of pollutants in air and water, ambient temperature, light intensity, precipitation, noise levels, or physical vibrations. Some “sensors” we might not even think of as such. For example, wi-fi hot spots can be used to estimate pedestrian traffic by counting the number of devices that engage them. New image-processing programs translate footage from security cameras into estimates of pedestrian, bicycle, car, and truck volume through a space. Many cities have also deployed “shot spotters” that detect the sound profile of gunshots. These are just a few examples, but they serve to illustrate the broad potential of sensor technologies.
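To make the wi-fi example concrete, the following is a minimal sketch of how device counts might be turned into an hourly estimate of pedestrian traffic. It is an illustration only, not a description of any deployed system: the log format (hot spot, device identifier, timestamp), the function name `devices_per_hour`, and the sample values are all assumptions introduced here. Counting each device once per hour is one simple choice among many.

```python
from collections import defaultdict
from datetime import datetime


def devices_per_hour(pings):
    """Count distinct devices seen at each hot spot in each clock hour.

    `pings` is an iterable of (hotspot_id, device_id, iso_timestamp) tuples,
    a hypothetical log format assumed for this sketch.
    """
    seen = defaultdict(set)
    for hotspot, device, stamp in pings:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(stamp).replace(
            minute=0, second=0, microsecond=0
        )
        # A set ensures a device that connects repeatedly is counted once.
        seen[(hotspot, hour)].add(device)
    return {key: len(devices) for key, devices in seen.items()}


if __name__ == "__main__":
    # Made-up sample data for illustration.
    sample = [
        ("plaza", "device-1", "2016-05-01T09:05"),
        ("plaza", "device-1", "2016-05-01T09:40"),
        ("plaza", "device-2", "2016-05-01T09:50"),
        ("plaza", "device-1", "2016-05-01T10:01"),
    ]
    for (hotspot, hour), n in sorted(devices_per_hour(sample).items()):
        print(hotspot, hour.isoformat(), n)
```

A count of this kind is only a proxy: it misses pedestrians without a connected device and can double-count people who carry more than one.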
A Composite View of the City
The knowledge derived from modern administrative data, sensor technologies, and Web 2.0 applications evokes an approach to measurement that combines many narrow observations to build a comprehensive view of the world. This is not an entirely novel concept (for example, Sampson, Raudenbush, and Earls developed a methodology in which they surveyed thousands of Chicagoans about their neighborhood to create robust measures of physical and social conditions across the city), but its scale and generality in urban informatics are distinctive.6 In the case of sensor technologies, a city or research center might deploy a set of units that track local conditions in real time. Each observes only a small slice of the world, but their composite provides detailed coverage across space. For example, the University of Chicago’s Urban Center for Computation and Data, which I will discuss further in the next section, is deploying a system of sensors called the Array of Things in Chicago. The sensors track localized environmental and atmospheric conditions and activity, and the overall system is billed as a “fitness tracker for the city.”7
In Web 2.0, human users provide the individual pieces of information. This process is referred to as crowdsourcing, a term that has entered common parlance through efforts such as Wikipedia, where the many members of the “crowd” collectively contribute to knowledge. At the intersection of crowdsourcing and sensor technologies is citizen sensing, in which members of the public are either an active or a passive vehicle for observing and recording events and conditions. At the most passive end of this spectrum, cell phone records register the location and activity of a user every time the user engages with a cell tower. At the other end, in one project bus drivers voluntarily carried GPS trackers in an effort to identify the unofficial “routes” of Nairobi’s informal transit system.8
Administrative records offer a third way to gain a composite view of the city. At times, these may be classified as citizen sensing, as the information is provided by constituents submitting forms or requests. For example, one might argue that 311 encourages residents to act as “the eyes and ears of the city.” In turn, it crowdsources a constantly updating map of the potholes, streetlight outages, and downed trees that need attention. Other administrative data, such as tax assessments, are generated through internally directed processes. Whether they arise from citizen sensing or not, administrative processes, just like sensors, generate thousands or even millions of records, each describing a discrete event or condition at a specific place and time. In turn, their corpus can be aggregated to describe localized patterns.
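The aggregation step described above can be illustrated with a short sketch. It assumes, purely for the sake of example, that each 311 record has been reduced to a case type, a neighborhood, and a report date; the records, the field layout, and the function name `count_by_area_and_month` are hypothetical and do not reflect any city's actual data schema.

```python
from collections import Counter

# Hypothetical 311 records: (case type, neighborhood, date of report).
RECORDS = [
    ("pothole", "Dorchester", "2015-03-02"),
    ("streetlight outage", "Dorchester", "2015-03-02"),
    ("pothole", "Dorchester", "2015-03-09"),
    ("downed tree", "Roxbury", "2015-03-04"),
    ("pothole", "Roxbury", "2015-04-11"),
]


def count_by_area_and_month(records):
    """Count reports per (neighborhood, month, case type)."""
    counts = Counter()
    for case_type, area, date in records:
        month = date[:7]  # "YYYY-MM" prefix of an ISO date string
        counts[(area, month, case_type)] += 1
    return counts


if __name__ == "__main__":
    for key, n in sorted(count_by_area_and_month(RECORDS).items()):
        print(key, n)
```

Each record on its own says little, but the grouped counts begin to describe where and when particular issues concentrate. In practice the grouping unit (neighborhood, census tract, street segment) and the time window are analytic choices rather than givens.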
The intertwined trends of (1) the emergence of novel data resources and (2) c...