eBook - ePub

Pragmatic Machine Learning with Python

Name: Pragmatic Machine Learning with Python
Author: Avishek Nag

Learn How to Deploy Machine Learning Models in Production

English

About This Book

An easy-to-understand guide to learn practical Machine Learning techniques with Mathematical foundations

Key Features

A balanced combination of underlying mathematical theories & practical examples with Python code
Coverage of latest topics like multi-label classification, Text Mining, Doc2Vec, Word2Vec, XMeans clustering, unsupervised outlier detection, techniques to deploy ML models in production-grade systems with PMML, etc
Coverage of sufficient & relevant visualization techniques specific to any topic
Description
This book will be ideal for working professionals who want to learn Machine Learning from scratch. The first chapter will be an introductory chapter to make readers comfortable with the idea of Machine Learning and the required mathematical theories. There will be a balanced combination of underlying mathematical theories corresponding to any Machine Learning topic and its implementation using Python. Most of the implementations will be based on 'scikit-learn,' but other Python libraries like 'Gensim' or 'PyTorch' will also be used for some topics like text analytics or deep learning. The book will be divided into chapters based on primary Machine Learning topics like Classification, Regression, Clustering, Deep Learning, Text Mining, etc. The book will also explain different techniques of putting Machine Learning models into production-grade systems using Big Data or Non-Big Data flavors and standards for exporting models.
What you will learn
Get familiar with practical concepts of Machine Learning from ground zero
Learn how to deploy Machine Learning models in production
Understand how to do "Data Science Storytelling"
Explore the latest topics in the current industry about Machine Learning
Who this book is for
This book would be ideal for experienced Software Professionals who are trying to get into the field of Machine Learning. It is also for anyone who wishes to learn Machine Learning concepts and how models fit into the production lifecycle.
Table of Contents
1. Introduction to Machine Learning & Mathematical preliminaries
2. Classification
3. Regression
4. Clustering
5. Deep Learning & Neural Networks
6. Miscellaneous Unsupervised Learning
7. Text Mining
8. Machine Learning models in production
9. Case Studies & Data Science Storytelling
About the Author
Avishek has a Master's degree in Data Analytics & Machine Learning from BITS (Pilani) and a Bachelor's degree in Computer Science from West Bengal University of Technology (WBUT). He has more than 14 years of experience in different renowned companies like VMware, Cognizant, Cisco, MobileIron, etc. He started his career as a Java developer and moved to the core area of Machine Learning around five years ago. He has practical experience in the design & development of Machine Learning systems, from inception to production, in multiple organizations. Strong foundations in Mathematics/Statistics and solid experience in product development have helped him to excel quickly in the world of ML & Data Science. He has shared his knowledge & experience through this book, which can help any Software Engineer to get a kick start in this area. Blog: https://medium.com/@avisheknag17 LinkedIn: https://www.linkedin.com/in/avishek-nag-957a0015/

Information

Publisher: BPB Publications
Year: 2020
ISBN: 9789389845365
Topic: Computer Science
Subtopic: Data Processing
Index: Computer Science

CHAPTER 1 Introduction to Machine Learning and Mathematical Preliminaries

In recent years, machine learning (ML) has become the most discussed topic in the computing and software industry. People and companies are chasing it. On reading the term machine learning, the first question that comes to mind is: can a brainless machine learn? Moreover, who supplies the knowledge for learning? Learning itself is an exciting and complex activity. When a teacher shows a group of school children many sample pictures to teach them how to identify animals, that is a typical learning process. The children learn the animals' names and the pictures that correspond to them.

In the same way, can a machine or a computer learn to do the same task? Yes, it can. The subject that studies these learning techniques is known as machine learning. Humans have a brain, so they learn by intuition, without explicitly thinking about the computational or mathematical background of the learning process. A computer, machine, or software program has no brain, so it needs an explicit mathematical learning process to compensate, because computers only understand numbers.

Structure

In this chapter, we will discuss:

Objectives of machine learning
Lifecycle of machine learning as software projects
Formal definitions of different machine learning techniques
Mathematical preliminaries required to understand machine learning in-depth

Objective

After reading this chapter, we should be able to:

Understand machine learning at a high level from a mathematical perspective.
Understand practical execution cycles of machine learning projects at a high level.
Understand what we mean by a machine learning model.
Differentiate between different techniques of machine learning and types of problems.
Understand some mathematical theories in the light of machine learning.

Purpose of machine learning

We understood from the previous section that a machine can learn. But why is that needed? What benefit do we get by making a machine learn? The benefit is the automation of human-performed or repetitive tasks. For example, a computer can read and recognize somebody’s facial gestures and use them as a key to open a door for that person. Manual intervention is thereby reduced.

Similarly, an e-commerce portal can learn a buyer’s purchasing pattern and recommend items for that buyer to purchase. Another example is the vetting of a loan applicant by a commercial bank: with ML, the bank can check whether the customer is likely to default (fail to repay the loan). In general, ML makes a wide variety of tasks possible, from identifying a picture to recommending items, and many more.

What is a machine learning model?

A machine learning model is a mathematical expression or equation, a complex data structure from computer science, or a combination of both. It sits at the intersection of statistics, core computer science, and software engineering. A model can learn from the actions of humans or nature and can simulate future behavior in an unknown situation. In simple terms, a model can predict things that may happen in the future. We will use the terms model and machine learning model interchangeably throughout our discussion.

Models learn from the history of actions, as said above. These actions are stored as records in a database. So, having a dataset is essential for building a model and using it.

What is a dataset?

From the concept of DBMS (Database Management Systems), we can say that a dataset is a collection of records. Each dataset consists of several rows and columns. Each column represents a different attribute of the records, as defined in DBMS. A simple dataset is exactly like a table in a relational DBMS. Sometimes, however, a dataset can be more complicated: it can be hierarchical or can contain other datasets within it. This resembles the data in a NoSQL database. A simple employee dataset is shown below:

Figure 1.1: Sample employee dataset

There are six columns (branch, department, designation, id, name, type) and four rows in the dataset. (There can be more rows; we consider just four for the purpose of this discussion.) An ML model can learn from this dataset and later give predictions. Having a dataset is essential for a model to be built and to work successfully.
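As a rough illustration of rows and columns, the sketch below represents such a dataset as a list of records using only the Python standard library. The column names are the ones listed above; every value in the rows is made up for illustration, since the figure itself is not reproduced here.

```python
# Hypothetical records: only the column names come from the text,
# all values are invented for illustration.
employees = [
    {"id": 1, "name": "Asha", "branch": "Kolkata", "department": "Accounts",
     "designation": "Analyst", "type": "Permanent"},
    {"id": 2, "name": "Bikram", "branch": "Pune", "department": "Marketing",
     "designation": "Manager", "type": "Permanent"},
    {"id": 3, "name": "Chitra", "branch": "Kolkata", "department": "Design",
     "designation": "Designer", "type": "Contract"},
    {"id": 4, "name": "Dev", "branch": "Pune", "department": "Accounts",
     "designation": "Analyst", "type": "Contract"},
]

# Each dict is one row; the dict keys are the columns.
columns = sorted(employees[0].keys())
n_rows = len(employees)
n_cols = len(columns)

print(f"{n_rows} rows x {n_cols} columns")
print("columns:", ", ".join(columns))
```

In practice a table library would usually hold such data, but a plain list of dictionaries is enough to show the row and column structure.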

What are the variables and features?

We will be using the terms variables and features many times throughout our discussion. In the above dataset, each column is a variable. From a database perspective, a column can take a different value in each row, which is why it is treated as a variable.

Predictor and target variables

For example, if we have to build a machine learning model that can predict the salary of an employee, then salary becomes the target variable, and all the other columns become predictor variables. A machine learning model analyzes the values of the predictor variables and tries to construct a mathematical form or data structure that can predict the values of the target variable.

Predictor variables are often called features. It is not a given that we will use every feature in a dataset to build a model. There are techniques for choosing the appropriate features, which we will discuss later.
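The split between predictors and target can be sketched in a few lines. This is a minimal example, assuming the records carry a salary column as the salary example implies; the helper name `split_features_target` and all row values are hypothetical.

```python
def split_features_target(rows, target):
    """Separate each record into its predictor values (features) and its target value."""
    features = [{k: v for k, v in row.items() if k != target} for row in rows]
    targets = [row[target] for row in rows]
    return features, targets


# Hypothetical records with made-up values.
rows = [
    {"department": "Accounts", "designation": "Analyst", "salary": 52000.0},
    {"department": "Marketing", "designation": "Manager", "salary": 81000.0},
    {"department": "Design", "designation": "Designer", "salary": 60500.0},
]

# X holds the predictor variables, y holds the target variable.
X, y = split_features_target(rows, target="salary")
print(X[0])
print(y)
```

A model would then be fitted to learn a mapping from each entry of `X` to the corresponding entry of `y`.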

Types of variables: Continuous and categorical

Broadly, there are two types of variables: continuous and categorical.

Continuous: Continuous variables can take any value, either within a specific range or unbounded, for example any real number with decimal places or any integer. In the above dataset, salary is a continuous variable.
Categorical: Categorical variables are generally of the string data type, but they can take only a fixed number of distinct values. An example is the department column in the above dataset, which can take only the values Accounts, Marketing, and Design. These values are whole strings: although a string is made of characters, we do not treat the individual characters as separate values; we consider the entire string as one value. The number of distinct values of a categorical variable is called its cardinality. The cardinality of department is 3 in the above dataset.
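Cardinality as defined above is simply a count of distinct values. A minimal sketch, with a hypothetical helper name and a made-up department column that uses the three values named in the text:

```python
def cardinality(values):
    """Number of distinct values observed for a variable."""
    return len(set(values))


# Made-up column; each string counts as one value, not as separate characters.
department = ["Accounts", "Marketing", "Design", "Accounts"]

print(cardinality(department))  # 3
```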

Apart from these two, there are other types of variables, such as binary and date. Binary variables are a special type of categorical variable with cardinality 2 (for example, True and False, or 0 and 1). A date variable can be decomposed into categorical and continuous variables.
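One possible way to decompose a date is sketched below. The text does not prescribe a particular decomposition, so the choice of parts here (year, month, weekday, day of year, and a weekend flag) and the helper name `decompose_date` are assumptions made for illustration.

```python
from datetime import date


def decompose_date(d):
    """Break a date into simpler variables (an illustrative choice of parts)."""
    return {
        "year": d.year,                        # numeric
        "month": d.month,                      # categorical, cardinality 12
        "weekday": d.weekday(),                # categorical, 0 = Monday ... 6 = Sunday
        "day_of_year": d.timetuple().tm_yday,  # numeric, 1..366
        "is_weekend": d.weekday() >= 5,        # binary
    }


parts = decompose_date(date(2020, 3, 14))
print(parts)
```

Here `is_weekend` is also an example of a binary variable, since it can only be True or False.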

Categorical variables don’t always need to be of string type. Numbers can also be treated as categorical. There is a heuristic for determining which variable is categorical and which one is continuous. We need to measure a ratio as given below:

We can set a threshold for this ratio to decide whether to treat a variable as continuous or categorical. A higher value of the ratio suggests that the variable should be treated as continuous; otherwise, it should be treated as categorical.
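The ratio itself is not reproduced in this excerpt. The sketch below assumes one plausible reading that fits the description (a higher value means continuous): the number of distinct values divided by the total number of values. The helper names and the threshold of 0.5 are hypothetical and would need tuning for a real dataset.

```python
def distinct_ratio(values):
    """Assumed ratio: distinct values / total values (not confirmed by the text)."""
    values = list(values)
    if not values:
        raise ValueError("values must not be empty")
    return len(set(values)) / len(values)


def variable_type(values, threshold=0.5):
    """Label a variable as continuous when its ratio exceeds the threshold."""
    return "continuous" if distinct_ratio(values) > threshold else "categorical"


# Made-up columns for illustration.
department = ["Accounts", "Marketing", "Design", "Accounts",
              "Design", "Accounts", "Marketing", "Accounts"]
salary = [52000.0, 81000.0, 60500.0, 49900.0,
          63250.0, 55100.0, 79000.0, 51750.0]

print(distinct_ratio(department), variable_type(department))
print(distinct_ratio(salary), variable_type(salary))
```

With only a handful of rows the ratio is unreliable, because even a categorical column can look mostly distinct; the heuristic becomes meaningful as the number of rows grows.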

Lifecycle of a machine learning model

Like any software development project, ML model development has a typical lifecycle. But in some areas, it differs a lot from traditional application or product development. The main reason is the research-oriented nature of the work. A typical model goes through several iterations before it is put into a production environment. Generally, ML falls under a broader discipline called data science. Besides ML, data science involves data discovery, exploration, and ordinary descriptive analytics. All research activities related to model development come under data science, and everything else that helps in rolling out the model to production, scaling it, and so on comes under data engineering.

There are four major stages in typical ML model development, as shown in the cyclic diagram below:

Figure 1.2: Life Cycle of Machine Learning Project

Data exploration, analysis, and research: In this stage, ML experts/data scientists get a first taste of the data. Various kinds of slicing, dicing, and visualization are done at this stage using a suitable sample of the total dataset. Trial and error with several models is also done here. In the end, one particular model is chosen to proceed with.
Model trainin...
