Beginning Apache Spark 3
eBook - ePub

With DataFrame, Spark SQL, Structured Streaming, and Spark Machine Learning Library

  1. English
  2. ePUB (mobile friendly)
  3. Only available on web


About This Book

Take a journey toward discovering, learning, and using Apache Spark 3.0. In this book, you will gain expertise on the powerful and efficient distributed data processing engine inside of Apache Spark; its user-friendly, comprehensive, and flexible programming model for processing data in batch and streaming; and the scalable machine learning algorithms and practical utilities to build machine learning applications.

Beginning Apache Spark 3 begins by introducing Spark concepts and architecture, the Spark unified stack, and the different ways of interacting with Apache Spark. Next, it offers an overview of Spark SQL before moving on to its advanced features. It covers tips and techniques for dealing with performance issues, followed by an overview of the Structured Streaming processing engine. It concludes with a demonstration of how to develop machine learning applications using Spark MLlib and how to manage the machine learning development lifecycle. This book is packed with practical examples and code snippets to help you master concepts and features immediately after they are covered in each section.

After reading this book, you will have the knowledge required to build your own big data pipelines, applications, and machine learning applications.

What You Will Learn

  • Master the Spark unified data analytics engine and its various components
  • Understand how the components work in tandem to provide a scalable, fault-tolerant, and performant data processing engine
  • Leverage the user-friendly and flexible programming model to perform simple to complex data analytics using DataFrame and Spark SQL
  • Develop machine learning applications using Spark MLlib
  • Manage the machine learning development lifecycle using MLflow

Who This Book Is For

Data scientists, data engineers and software developers.

Frequently asked questions

Yes, you can access Beginning Apache Spark 3 by Hien Luu in PDF and/or ePUB format, as well as other popular books in Computer Science & Databases. We have over one million books available in our catalogue for you to explore.

Information

Publisher: Apress
Year: 2021
ISBN: 9781484273838
Edition: 2

Table of contents

  • Cover
  • Front Matter
  • 1. Introduction to Apache Spark
  • 2. Working with Apache Spark
  • 3. Spark SQL: Foundation
  • 4. Spark SQL: Advanced
  • 5. Optimizing Spark Applications
  • 6. Spark Streaming
  • 7. Advanced Spark Streaming
  • 8. Machine Learning with Spark
  • 9. Managing the Machine Learning Life Cycle
  • Back Matter