Big Data, Open Data and Data Development
eBook - ePub

About This Book

The world has become digital, and technological advances have multiplied the points of access to data, as well as the means of processing and disseminating it. New technologies have now reached a certain maturity. Data are available to everyone, anywhere on the planet. The number of Internet users in 2014 was 2.9 billion, or 41% of the world population. The need for knowledge is becoming apparent in order to understand this multitude of data. We must educate, inform and train the masses. The development of related technologies, such as the advent of the Internet, social networks and "cloud computing" (digital factories), has increased the available volumes of data. Currently, each individual creates, consumes and uses digital information: more than 3.4 million e-mails are sent worldwide every second, or 107,000 billion annually, which amounts to 14,600 e-mails per person per year; more than 70% of them are spam. Billions of pieces of content are shared on social networks such as Facebook, more than 2.46 million every minute. We spend more than 4.8 hours a day on the Internet using a computer, and 2.1 hours using a mobile. Data, this new ethereal manna from heaven, is produced in real time. It comes in a continuous stream from a multitude of sources, which are generally heterogeneous.
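The e-mail figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustrative consistency check, not part of the book: it assumes a 365-day year, and the variable names are hypothetical.

```python
# Consistency check of the e-mail figures quoted in the text.
# Assumption: a 365-day year (leap years ignored).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

emails_per_second = 3.4e6             # "more than 3.4 million e-mails ... every second"
emails_per_person_per_year = 14_600   # "14,600 e-mails per year per person"

# Annual volume implied by the per-second rate.
emails_per_year = emails_per_second * SECONDS_PER_YEAR

# Number of people over which the annual volume must be spread
# to obtain the quoted per-person figure.
implied_people = emails_per_year / emails_per_person_per_year

print(f"{emails_per_year / 1e9:,.0f} billion e-mails per year")
print(f"{implied_people / 1e9:.1f} billion people implied")
```

The annual total comes out at roughly 107,000 billion, as stated, and the per-person figure is consistent with dividing by a population of about 7.3 billion, that is, by the world population rather than by the 2.9 billion Internet users.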

This accumulation of data of all types (audio, video, files, photos, etc.) generates new activities, the aim of which is to analyze this enormous mass of information. It is then necessary to adapt and try new approaches, new methods, new knowledge and new ways of working, resulting in new properties and new challenges, since SEO logic must be created and implemented. At company level, this mass of data is difficult to manage, and interpreting it is the primary challenge. This affects those whose job is to "manipulate" that mass, and it requires a specific infrastructure for creation, storage, processing, analysis and retrieval. The biggest challenge lies in "the valuing of data" that is available in quantity, diversity and access speed.

Big Data, Open Data and Data Development by Jean-Louis Monino and Soraya Sedkaoui is available in PDF and/or ePUB format, and is catalogued under Technology & Engineering and Energy Resources.

Information

Publisher: Wiley-ISTE
Year: 2016
ISBN: 9781119285212

1. The Big Data Revolution

The amount of data generated by people, Internet-connected devices and companies is growing at an exponential rate. Financial institutions, companies and health service providers generate large quantities of data through their interactions with suppliers, patients, customers and employees. Beyond those interactions, large volumes of data are created through Internet searches, social networks, GPS systems and stock market transactions. This widespread production of data has resulted in the “data revolution” or the Age of Big Data.
The term “Big Data” is used to describe a universe of very large sets of data composed of a variety of elements. This gives rise to a new generation of information technology designed to deliver the processing speeds needed to analyze and extract value from large sets of data, using, of course, specialized hardware and software. The phenomenon of Big Data refers not only to the explosion in the volume of data produced, made possible by the development of information storage and dissemination capacities on all sorts of platforms, but also to a second phenomenon: newfound data processing capabilities.
In general terms, the concept of Big Data describes the current state of affairs in the world, in which there is a constant question of how to better manage masses of data and how to make sense of the massive volume of data produced daily.
Data sources are multiplying: smartphones, tablets, social networks, web services and so on. Once these intelligent objects are connected to the Internet, they can feed data into enormous databases and communicate with other objects and humans [PRI 02]. This data must be processed and developed in order to become “intelligent” or “smart”. Intelligence, which can be brought out by using analysis techniques, can provide essential information that top management will require in order to determine strategies, boost operational performance and manage risks.
To this end, “data scientists” must pool their strengths in order to face the challenges of analyzing and processing large pools of data, gaining clarity and precision. Data scientists must make data “speak” by using statistical techniques and specialized software designed to organize, synthesize and translate the information that companies need to facilitate their individual decision-making processes.

1.1. Understanding the Big Data universe

The IT craze that has swept through our society has reached a new level of maturity. When we analyze this tendency, we cannot help being overwhelmed by the transformations that it has produced across all sectors. This massive wave developed very quickly and has resulted in new applications. Information and communication technologies (ICTs) and the advent of the Internet have triggered an explosion in the flow of information (Big Data). The world has become digital, and technological advances have multiplied points of access to data.
But what exactly is Big Data? The concept really took off with the publication of three important reports from McKinsey & Company:
  • Clouds, Big Data, and Smart Assets: Ten Tech-Enabled Business Trends to Watch [BUG 10];
  • Are You Ready for the Era of “Big Data”? [BRO 11];
  • Big Data: The Next Frontier for Innovation, Competition and Productivity [MAN 11].
“Big Data” describes “a series of data, types of data, and tools to respond quickly to the growing amount of data that companies process throughout the world”. The amount of data gathered, stored and processed by a wide range of companies has increased exponentially. This growth has been driven in part by an explosion in the amount of data resulting from web transactions, social media and bots.
The growth of available data in terms of quantity, diversity, access speed and value has been enormous, giving rise to the “four Vs” that are used to define the term Big Data: “Volume”, “Variety”, “Velocity” and “Value”:
  • Volume: the advent of the Internet and the wave of transformations in social media that it has produced, data from device sensors and an explosion of e-commerce all mean that industries are inundated with data that can be extremely valuable. All these new devices produce more and more data and, in turn, enrich the volume of existing data;
  • Variety: with the rise of Internet and Wi-Fi networks, smartphones, connected objects and social networks, more and more diverse data is produced. This data comes from different sources and varies in nature (SMS messages, Tweets, social networks, messaging platforms, etc.);
  • Velocity: the speed at which data is produced, made available and interpreted in real time. Real-time processing is of particular interest because it allows companies to deliver results such as personalized advertisements on websites based on our purchase history;
  • Value: the objective of companies is to benefit from data, especially by making sense out of it.
The challenges of Big Data are related to the volume of data, its variety, the speed at which it is processed, and its value. Some scholars add another three “Vs”, namely “Variability”, “Veracity” and “Visualization”.
Variability refers to data whose meaning evolves constantly. Veracity qualifies the result of the data’s use: even though there is a general consensus about the potential value of Big Data, data has almost no value at all if it is not accurate. This is particularly the case for programs that involve automatic decision-making, or for data feeding into unmonitored machine learning algorithms. Visualization, which touches on one of the greatest challenges of Big Data, has to do with the way in which the results of data processing (information) are presented in order to ensure superior clarity.
The expression “Big Data” represents a market in and of itself. Gilles Grapinet, deputy CEO of Atos, notes that “with Big Data, organizations’ data has become a strategic asset. A giant source of unexpected resources has been discovered.” This enormous quantity of data is a valuable asset in our information society.
Big Data, therefore, represents a large discipline that is not limited to the technological aspect of things. In recent years, the concept has sparked growing interest from actors in the information management systems sector. The concept of the “four Vs”, or even that of the “seven Vs”, opens up new avenues for consideration and research, but neither provides a clear definition of the phenomenon. Taken together, these “Vs” open up new perspectives for new product creation through improved risk management and enhanced client targeting. Actions aimed at anticipating and reducing subscription suspensions or at making customers more loyal can also be envisioned.
The increase in the volume of data, processing speed and data diversity all present new challenges to companies and affect their decisi...

Table of contents

  1. Cover
  2. Table of Contents
  3. Title
  4. Copyright
  5. Acknowledgements
  6. Foreword
  7. Key Concepts
  8. Introduction
  9. 1 The Big Data Revolution
  10. 2 Open Data: A New Challenge
  11. 3 Data Development Mechanisms
  12. 4 Creating Value from Data Processing
  13. Conclusion
  14. Bibliography
  15. Index
  16. End User License Agreement