Abstract
Transportation continues to play a strategic role in our worldwide economy, delivering goods and people through increasingly complex, interconnected, and multimodal transportation systems. Unfortunately, the complexities of modern transportation cannot be managed using yesterday’s tools. For example, the data collected by intelligent transportation system (ITS) technologies are increasingly complex and are characterized by heterogeneous formats, large volume, nuances in spatial and temporal processes, and frequent real-time processing requirements. Simple data processing, integration, and analytics tools do not meet the needs of complex ITS data processing tasks. The application of emerging data analytic systems and methods, together with effective data collection and information distribution systems, provides the opportunities required for building the ITSs of today and tomorrow.
Imagine a world in which all products always arrive in a predetermined order, on time, at low cost, and with consistent results, and are produced by a happy and productive transportation workforce. People travel in safe, comfortable, efficient systems that are affordable, convenient, and friendly to our environment. An educated transportation system workforce, including engineers, scientists, and operational professionals, has the tools to design, build, test, provision, operate, and optimize the systems. It also has the knowledge to use these tools. This educated workforce is inherently multidisciplinary, combining expertise from transportation engineering, software engineering, computer science, business, statistics, and mathematics.
ITS turn data into actionable knowledge, enabling transportation users to make informed decisions that ensure the safe and efficient use of the facilities. For example, in such a system, every traveler has access to the most reliable and up-to-date status of almost all transportation modes from any point on the transportation network. Travelers use devices that include instrumented vehicles, smartphones, tablet computers, and roadside information displays. They can then choose the mode and route that will give them the minimum travel time and distance, making dynamic adjustments based on real-time information.
In this chapter, we will demonstrate that ITS is a data-intensive application. First, we provide a summary of the sources and characteristics of ITS data, discussing the relationship of ITS to data analytics. Next, a review of the US National ITS architecture is given as an example framework for ITS planning, design, and deployment, with an overview of ITS applications and their relationships to data analytics. Finally, a brief history of ITS deployment around the world is given, and a future characterized by the technological advances in ITS is presented.
Keywords
Intelligent transportation systems; big data; ITS architecture; ITS applications; ITS history; data analytics; connected vehicle; ITS data collection technology
1.1 Intelligent Transportation Systems as Data-Intensive Applications
Intelligent transportation system (ITS) applications are complex, data-intensive applications with characteristics that can be described using the “5Vs of Big Data”: (1) volume, (2) variety, (3) velocity, (4) veracity, and (5) value (for the original 3Vs, see Ref. [1]). Note that any single one of these characteristics can produce challenges for traditional database management systems, and data with several of these characteristics are untenable for traditional data processing systems. Therefore, data infrastructures and systems that can handle large amounts of historic and real-time data are needed to transform ITS from a conventional technology-driven system to a complex data-driven system.
The first “V” is the volume of ITS data, which is growing exponentially for transportation systems. With the growing number of complex data collection technologies, unprecedented amounts of transportation-related data are being generated every second. For example, approximately 480 TB of data was collected by every automotive manufacturer in 2013, a figure that is expected to increase to 11.1 PB/year by 2020 [2]. Similarly, the 500 cameras of the closed-circuit television (CCTV) system in the city of London generate 1.2 Gbps [3].
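To put the quoted volumes on a common scale, the short Python sketch below works through the arithmetic. It is an illustration only: it assumes decimal (SI) units, treats 480 TB as an annual figure comparable to 11.1 PB/year, and treats 1.2 Gbps as a sustained aggregate rate. The function names are hypothetical and do not come from Refs. [2] or [3].

```python
# Figures quoted in the text. Decimal (SI) units and a sustained,
# aggregate CCTV bit rate are assumptions made for this sketch.
TB_2013 = 480.0        # TB collected per manufacturer in 2013
PB_2020 = 11.1         # PB/year expected by 2020
YEARS = 2020 - 2013
CCTV_GBPS = 1.2        # aggregate rate of the camera network
CCTV_CAMERAS = 500


def implied_annual_growth(start_tb, end_pb, years):
    """Compound annual growth rate implied by two data volumes."""
    end_tb = end_pb * 1000.0          # 1 PB = 1000 TB (decimal)
    return (end_tb / start_tb) ** (1.0 / years) - 1.0


def daily_volume_tb(rate_gbps):
    """Terabytes accumulated per day at a sustained rate in Gbit/s."""
    bytes_per_second = rate_gbps * 1e9 / 8
    return bytes_per_second * 86_400 / 1e12


growth = implied_annual_growth(TB_2013, PB_2020, YEARS)
cctv_tb_per_day = daily_volume_tb(CCTV_GBPS)
per_camera_mbps = CCTV_GBPS * 1000 / CCTV_CAMERAS

print(f"implied growth: {growth:.1%} per year")
print(f"CCTV volume:    {cctv_tb_per_day:.2f} TB/day")
print(f"per camera:     {per_camera_mbps:.1f} Mbit/s")
```

Under these assumptions, the manufacturer figures imply growth of a little over 50% per year, and the camera network alone would accumulate on the order of ten terabytes per day if all of its output were retained.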
The second “V” of ITS data is the variety of the data, which are collected in various formats and in a number of ways, including numeric data captured from sensors on both vehicles and infrastructure, text data from social media, and image and GIS data loaded from maps. The degree of organization of these data can vary from semi-structured data (e.g., repair logs, images, videos, and audio files) to structured data (e.g., data from sensor systems and data from within a traffic incident data warehouse) [4]. Social media data are considered to be semi-structured, containing tags or a common structure with distinct semantic elements. Different datasets have different formats that vary in file size, record length, and encoding schemes, the contents of which can be homogeneous or heterogeneous (i.e., with many data types such as text, discrete numeric data, and continuous numeric data that may or may not be tagged). These heterogeneous datasets, generated by different sources in different formats, impose significant challenges for data ingestion and integration in a data analytics system. However, their fusion enables sophisticated analyses, ranging from self-learning algorithms for pattern detection to dimension reduction approaches for complex predictions.
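The integration challenge can be made concrete with a minimal Python sketch that maps two differently formatted feeds onto one common record type. Everything in it is an assumption made for illustration: the `Observation` schema, the CSV column names, the JSON keys, and the sample values are hypothetical and do not describe any real ITS feed.

```python
import csv
import io
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Observation:
    """Hypothetical common schema for heterogeneous ITS records."""
    source: str
    timestamp: str
    location: str
    speed_kmh: Optional[float]   # present for structured sensor rows
    text: Optional[str]          # present for semi-structured posts


def from_sensor_csv(raw: str) -> List[Observation]:
    """Parse structured detector data (assumed columns shown below)."""
    rows = csv.DictReader(io.StringIO(raw))
    return [
        Observation("sensor", r["time"], r["detector_id"],
                    float(r["speed_kmh"]), None)
        for r in rows
    ]


def from_social_json(raw: str) -> List[Observation]:
    """Parse semi-structured posts; some keys may be absent."""
    return [
        Observation("social", p["created_at"],
                    p.get("place", "unknown"), None, p["text"])
        for p in json.loads(raw)
    ]


# Made-up sample inputs in two different formats.
SENSOR_CSV = (
    "time,detector_id,speed_kmh\n"
    "2024-01-01T08:00:00Z,D-17,62.5\n"
    "2024-01-01T08:00:30Z,D-17,58.0\n"
)
SOCIAL_JSON = (
    '[{"created_at": "2024-01-01T08:01:10Z", '
    '"text": "Crash on the ramp, traffic stopped", "tags": ["traffic"]}]'
)

observations = from_sensor_csv(SENSOR_CSV) + from_social_json(SOCIAL_JSON)
observations.sort(key=lambda o: o.timestamp)   # one merged timeline
for obs in observations:
    print(obs)
```

The optional fields reflect the point made above: a fused dataset has to tolerate records in which some elements are missing or untagged, which is why ingestion logic tends to be source-specific even when the target schema is shared.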
The third “V” of ITS data, velocity, varies widely. Data ingest rates and processing requirements range from batch processing to real-time event processing of online data feeds, placing heavy demands on data infrastructure. Some data are collected continuously, in real time, whereas other data are collected at regular intervals. For example, most state Departments of Transportation (DOTs) use automated data collectors that feed media outlets with data. One such example is the Commercial/Media Wholesale Web Portal (CWWP) designed by the California DOT (Caltrans) to facilitate the data needs of commercial and media information service providers. The CWWP requests and receives traveler information generated by the data collection devices maintained by Caltrans [5]. Although speed data from traffic are collected continuously, data such as road maps may be updated at less frequent intervals.
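The contrast between batch and real-time processing can be sketched in a few lines of Python. This is a simplified illustration, not a description of the CWWP or of any production system: the class name, window size, and speed readings are all hypothetical.

```python
from collections import deque
from statistics import mean


def batch_mean_speed(readings):
    """Batch style: every reading is available before processing."""
    return mean(readings)


class SlidingWindowMean:
    """Streaming style: refresh an average as each reading arrives."""

    def __init__(self, window_size):
        # A deque with maxlen drops the oldest reading automatically.
        self.window = deque(maxlen=window_size)

    def update(self, reading):
        self.window.append(reading)
        return sum(self.window) / len(self.window)


# Made-up speed readings (km/h) from a single detector.
readings = [62.0, 58.0, 30.0, 28.0, 55.0]

batch_result = batch_mean_speed(readings)

stream = SlidingWindowMean(window_size=3)
stream_results = [stream.update(r) for r in readings]

print("batch mean:", batch_result)
print("streaming means:", stream_results)
```

The batch function produces a single answer once the whole dataset has been collected, while the streaming version yields an updated estimate after every reading using bounded memory. The second pattern is the one that real-time feeds require of the underlying infrastructure.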
The term veracity is the fourth “V” of ITS data and is used to describe the certainty or trustworthiness of ITS data. For example, any decision made from a data stream is predicated upon the integrity of the source and the data stream, that is, the correct calibration of s...