Statistics for Machine Learning

Implement Statistical methods used in Machine Learning using Python (English Edition)


About This Book

A practical guide that will help you understand the statistical foundations of any Machine Learning problem.

Key Features

  • Develop a Conceptual and Mathematical understanding of Statistics
  • Get an overview of Statistical Applications in Python
  • Learn how to perform Hypothesis testing in Statistics
  • Understand why Statistics is important in Machine Learning
  • Learn how to process data in Python

Description

This book covers statistical concepts in detail, along with their applications in Python. It starts with an introduction to Statistics and moves on to basic Descriptive Statistics concepts such as mean, median, and mode. You will then explore the concept of Probability and look at different types of Probability Distributions. Next, you will look at parameter estimation for the unknown parameters of a population and study Random Variables, which are used to record the results of an experiment in Statistics. You will then explore one of the most important fields in Statistics, Hypothesis Testing, and the various types of tests used to check a hypothesis. The last part of the book focuses on how you can process data using Python, some elements of Non-parametric statistics, and finally an introduction to Machine Learning.

What you will learn
  • Understand the basics of Statistics
  • Get to know more about Descriptive Statistics
  • Understand and learn advanced Statistics techniques
  • Learn how to apply Statistical concepts in Python
  • Understand important Python packages for Statistics and Machine Learning

Who this book is for

This book is for anyone who wants to understand Statistics and its use in Machine Learning. It will help you understand the Mathematics behind the Statistical concepts and their applications using the Python language. A working knowledge of Python is a prerequisite.

Table of Contents
    1. Introduction to Statistics
    2. Descriptive Statistics
    3. Probability
    4. Random Variables
    5. Parameter Estimations
    6. Hypothesis Testing
    7. Analysis of Variance
    8. Regression
    9. Non Parametric Statistics
    10. Data Analysis using Python
    11. Introduction to Machine Learning

About the Authors

    Himanshu Singh is an AI Technology Lead at Legato Healthcare (An Anthem Inc. Company). He has around 7 years of experience in the domain of Machine Learning and Artificial Intelligence. Himanshu is the author of three books on Machine Learning and is a trainer by passion. He is a guest faculty member at various institutes, including Narsee Monjee Institute of Management Studies, IMT, and Vignana Jyothi Institute of Management.
    LinkedIn Profile: https://www.linkedin.com/in/himanshu-singh-2264a350/
    Blog links: https://medium.com/@himanshuit3036
    Facebook Profile: https://www.facebook.com/silli23


CHAPTER 1

Introduction to Statistics

This chapter focuses on the various parameters related to statistics. It will guide you through all the ingredients required for the statistical recipes.

Structure

  • Population and sample
  • Introduction to random variables
  • Other variables
  • Introduction to descriptive statistics
  • Visualizations

Objectives

This chapter aims to give readers a foundation in statistics and in using Python for statistical work.

Population and Sample

Suppose you want to start a new service or product-based company. The type of company and the way it is operated may differ, but your company will fail if it offers something that no one needs.
But how do you know whether your offering is right? Will people like it? Is there a need for it? There is only one way to resolve these doubts: market research.
Whenever a company launches a new product, it carries out market research to determine the product's feasibility, the areas in which the product has the highest demand, the demographics that the company should target, and so on. Without research, it's like shooting an arrow into the dark.
Research is not limited to business; you can find its application in all walks of life. Politics, sports, and even a movie launch all depend on research.
Research begins with determining the target audience or target market. Suppose we are making a cosmetic product. Our target market could be females over 15 years of age who live in metropolitan cities. Everyone who meets these criteria, or whatever criteria the research team sets, is considered part of our population. A team starts its research only after it has carefully drafted the criteria to be met by the population. Once this is done, it draws samples.
There are various reasons why we must draw samples from the population. We will look at all these reasons in Chapter 6, but the most important one is that we cannot cover the entire population. Although we know our target population, it is next to impossible to reach each person and interview them. So, different approaches are used to draw samples from the population and carry out the research on them. The approaches to drawing samples are listed here (we'll discuss all of them in detail in Chapter 6).
Probabilistic sampling:
  • Random sampling
  • Sequential random sampling
  • Cluster sampling
  • Stratified sampling
Non-probabilistic sampling:
  • Judgment sampling
  • Convenience sampling
  • Snowball sampling
  • Quota sampling
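As a rough illustration of two of the probabilistic approaches, the following minimal sketch first filters a made-up set of records down to the population defined by the cosmetic-product criteria, then draws a simple random sample and a stratified sample from it. The records, field names, the helper functions, and the choice of age decade as the stratum are all assumptions made for this example; they are not taken from the book.

```python
import random
from collections import defaultdict

# Hypothetical records; the fields and values are invented for illustration.
people = [
    {
        "id": i,
        "gender": "F" if i % 2 == 0 else "M",
        "age": 10 + i % 50,
        "city_type": "metro" if i % 3 else "non-metro",
    }
    for i in range(300)
]

# Population: everyone who meets the research criteria described in the text.
population = [
    p for p in people
    if p["gender"] == "F" and p["age"] > 15 and p["city_type"] == "metro"
]


def simple_random_sample(units, n, seed=0):
    """Draw n units without replacement, each unit equally likely."""
    return random.Random(seed).sample(units, n)


def stratified_sample(units, stratum_of, fraction, seed=0):
    """Group units into strata, then sample the same fraction from each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    chosen = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        chosen.extend(rng.sample(members, k))
    return chosen


def age_decade(person):
    """Assumed stratum: the decade a person's age falls in (10, 20, ...)."""
    return person["age"] // 10 * 10


srs = simple_random_sample(population, 10)
strat = stratified_sample(population, age_decade, fraction=0.2)
print(len(population), len(srs), len(strat))
```

The simple random sample can miss an age group entirely by chance, whereas the stratified version guarantees that every age decade present in the population appears in the sample.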

Introduction to Random Variables

How do we carry out the research?
We define the questions related to the research and the instruments to measure the answers. The questions can be open-ended or closed-ended. Open-ended questions do not restrict the respondent to specific answers, so they can write whatever they feel like. For example:
What do you feel about the current election scenario?
Now, the answer to this question will differ drastically for different people. Some may give positive answers, while others may give negative ones, and the language used will always differ.
When it comes to closed-ended questions, the response is limited. For example:
What is your age?
  • 10-20
  • 20-30
  • 30-40
  • 40-50
  • 50+
In the preceding example, the respondent has a limited number of options to choose from. They cannot give any other input.
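One way to picture a closed-ended question in code is as a fixed set of allowed answers. The sketch below is only an illustration: the option labels are copied from the example above, while the function name `record_answer` and the sample responses are hypothetical.

```python
from collections import Counter

# The options offered to the respondent, copied from the example above.
AGE_OPTIONS = ("10-20", "20-30", "30-40", "40-50", "50+")


def record_answer(answer):
    """Return the cleaned answer, or raise if it is not one of the options."""
    cleaned = answer.strip()
    if cleaned not in AGE_OPTIONS:
        raise ValueError(f"{answer!r} is not one of {AGE_OPTIONS}")
    return cleaned


raw_answers = ["20-30", "50+", " 20-30 ", "10-20"]  # made-up responses
tally = Counter(record_answer(a) for a in raw_answers)
print(tally)
```

Because every valid answer belongs to a known, finite set, the responses can be counted directly, which is not possible with free-text answers to an open-ended question.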
Figure 1.1
Now, once a respondent has submitte...

Table of contents

  1. Cover Page
  2. Title Page
  3. Copyright Page
  4. Dedication Page
  5. About the Author
  6. About the Reviewer
  7. Acknowledgements
  8. Preface
  9. Errata
  10. Table of Contents
  11. 1. Introduction to Statistics
  12. 2. Descriptive Statistics
  13. 3. Random Variables
  14. 4. Probability
  15. 5. Parameter Estimation
  16. 6. Hypothesis Testing
  17. 7. Analysis of Variance
  18. 8. Regression
  19. 9. Data Analysis Using Python
  20. 10. Non-Parametric Statistics
  21. 11. Introduction to Machine Learning
  22. Index