Data Science Solutions with Python
eBook - ePub

Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn

  1. English
  2. ePUB (mobile friendly)
  3. Available on iOS & Android

About This Book

Apply supervised and unsupervised learning to solve practical, real-world big data problems. This book teaches you how to engineer features, optimize hyperparameters, train and test models, develop pipelines, and automate the machine learning (ML) process.
The book covers PySpark, an in-memory, distributed cluster computing framework; the machine learning frameworks scikit-learn, PySpark MLlib, H2O, and XGBoost; and Keras, a deep learning (DL) framework.

The book starts by presenting supervised and unsupervised ML and DL models, and then examines big data frameworks along with ML and DL frameworks. Author Tshepo Chris Nokeri considers a parametric model known as the Generalized Linear Model, and a survival regression model known as the Cox Proportional Hazards model along with the Accelerated Failure Time (AFT) model. Also presented are a binary classification model (logistic regression) and an ensemble model (Gradient Boosted Trees). The book introduces DL and an artificial neural network known as the Multilayer Perceptron (MLP) classifier. It covers cluster analysis using the K-Means model, and explores dimension reduction techniques such as Principal Component Analysis and Linear Discriminant Analysis. Finally, automated machine learning is unpacked.

This book is for intermediate-level data scientists and machine learning engineers who want to learn how to apply key big data frameworks and ML and DL frameworks. You will need prior knowledge of basic statistics, Python programming, probability theory, and predictive analytics.

What You Will Learn

  • Understand widely used supervised and unsupervised learning methods, including key dimension reduction techniques
  • Know the big data analytics layers such as data visualization, advanced statistics, predictive analytics, machine learning, and deep learning
  • Integrate big data frameworks with a hybrid of machine learning frameworks and deep learning frameworks
  • Design, build, test, and validate skilled machine learning models and deep learning models
  • Optimize model performance using data transformation, regularization, outlier remedying, hyperparameter optimization, and data split ratio alteration

Who This Book Is For
Data scientists and machine learning engineers with basic knowledge and understanding of Python programming, probability theory, and predictive analytics.

Information

Publisher: Apress
Year: 2021
ISBN: 9781484277621

Table of contents

  1. Cover
  2. Front Matter
  3. 1. Exploring Machine Learning
  4. 2. Big Data, Machine Learning, and Deep Learning Frameworks
  5. 3. Linear Modeling with Scikit-Learn, PySpark, and H2O
  6. 4. Survival Analysis with PySpark and Lifelines
  7. 5. Nonlinear Modeling with Scikit-Learn, PySpark, and H2O
  8. 6. Tree Modeling and Gradient Boosting with Scikit-Learn, XGBoost, PySpark, and H2O
  9. 7. Neural Networks with Scikit-Learn, Keras, and H2O
  10. 8. Cluster Analysis with Scikit-Learn, PySpark, and H2O
  11. 9. Principal Component Analysis with Scikit-Learn, PySpark, and H2O
  12. 10. Automating the Machine Learning Process with H2O
  13. Back Matter