![Applied Unsupervised Learning with R](https://img.perlego.com/book-covers/955535/9781789951462_300_450.webp)
Applied Unsupervised Learning with R
Uncover hidden relationships and patterns with k-means clustering, hierarchical clustering, and PCA
Alok Malik, Bradford Tuckfield
- 320 pages
- English
- ePUB (mobile friendly)
- Available on iOS and Android
About this book
Design clever algorithms that discover hidden patterns and draw responses from unstructured, unlabeled data.
Key Features
- Build state-of-the-art algorithms that can solve your business problems
- Learn how to find hidden patterns in your data
- Revise key concepts with hands-on exercises using real-world datasets
Book Description
Starting with the basics, Applied Unsupervised Learning with R explains clustering methods, distribution analysis, data encoders, and features of R that enable you to understand your data better and get answers to your most pressing business questions.
This book begins with clustering, the most important and commonly used method for unsupervised learning, and explains the three main clustering algorithms: k-means, divisive, and agglomerative. Following this, you'll study market basket analysis, kernel density estimation, principal component analysis, and anomaly detection. You'll be introduced to these methods using code written in R, with further instructions on how to work with, edit, and improve R code. To help you gain a practical understanding, the book also features useful tips on applying these methods to real business problems, including market segmentation and fraud detection. By working through interesting activities, you'll explore data encoders and latent variable models.
By the end of this book, you will have a better understanding of different anomaly detection methods, such as outlier detection, Mahalanobis distances, and contextual and collective anomaly detection.
What you will learn
- Implement clustering methods such as k-means, agglomerative, and divisive
- Write code in R to analyze market segmentation and consumer behavior
- Estimate distribution and probabilities of different outcomes
- Implement dimension reduction using principal component analysis
- Apply anomaly detection methods to identify fraud
- Design algorithms with R and learn how to edit or improve code
Who this book is for
Applied Unsupervised Learning with R is designed for business professionals who want to learn about methods to understand their data better, and developers who have an interest in unsupervised learning. Although the book is for beginners, it will be beneficial to have some basic, beginner-level familiarity with R. This includes an understanding of how to open the R console, how to read data, and how to create a loop. To easily understand the concepts of this book, you should also know basic mathematical concepts, including exponents, square roots, means, and medians.
Chapter 1
Introduction to Clustering Methods
Learning Objectives
- Describe the uses of clustering
- Perform the k-means algorithm using built-in R libraries
- Perform the k-medoids algorithm using built-in R libraries
- Determine the optimum number of clusters
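As a rough preview of these objectives, the sketch below runs k-means with the `kmeans()` function from R's standard `stats` package and computes the quantity behind the common "elbow" heuristic for choosing the number of clusters. The choice of the built-in `iris` dataset, the value `centers = 3`, the range of k from 1 to 8, and the use of `pam()` from the `cluster` package for k-medoids are all assumptions made for illustration; they are not taken from the chapter itself.

```r
# Minimal sketch: k-means on the numeric columns of the built-in iris data.
data(iris)
features <- scale(iris[, 1:4])    # standardize so no variable dominates the distance

set.seed(42)                      # k-means starts from random centers
km <- kmeans(features, centers = 3, nstart = 25)

# Cluster sizes, and a cross-tabulation against the species column
# (in a real unsupervised setting such labels would not be available)
print(km$size)
print(table(cluster = km$cluster, species = iris$Species))

# Elbow heuristic: total within-cluster sum of squares for k = 1..8.
# Plotting wss against k and looking for the bend suggests a value of k.
wss <- sapply(1:8, function(k) {
  kmeans(features, centers = k, nstart = 25)$tot.withinss
})
print(round(wss, 1))

# k-medoids via pam() from the 'cluster' package, only if it is installed
if (requireNamespace("cluster", quietly = TRUE)) {
  pm <- cluster::pam(features, k = 3)
  print(pm$medoids)               # medoids are actual rows of the data
}
```

Standardizing the columns first is a design choice rather than a requirement: without it, the variable with the largest numeric range would dominate the Euclidean distances that k-means relies on.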
Introduction
![Figure 1.1: Increase in data year on year](OEBPS/image/C12628_01_01-plgo-compressed.webp)
Figure 1.1: The increase in digital data year on year
Introduction to Clustering
![Figure 1.2: Representation of two clusters in a dataset](OEBPS/image/C12628_01_02-plgo-compressed.webp)
Figure 1.2: A representation of two clusters in a dataset
![Figure 1.3: Representation of three clusters in a dataset](OEBPS/image/C12628_01_03-plgo-compressed.webp)
Figure 1.3: A representation of three clusters in a dataset
Uses of Clustering
- Exploratory data analysis: When we have unlabeled data, we often do clustering to explore the underlying structure and categories of the dataset. For example, a retail store might want to explore how many different segments of customers they have, based on purchase history.
- Generating training data: Sometimes, after processing unlabeled data with clustering methods, it can be labeled for further training with supervised learning algorithms. For example, two different classes that are unlabeled might form two entirely different clusters, and using their clusters, we can label data for further supervised learning algorithms that are more efficient in real-time classification than our unsupervised learning algorithms.
- Recommender systems: With the help of clustering, we can find the properties of similar items and use these properties to make recommendations. For example, an e-commerce website, after finding customers in the same clusters, can recommend items to customers in that cluster based upon the items bought by other customers in that cluster.
- Natural language processing: Clustering can be used for the grouping of similar words, texts, articles, or tweets, without labeled data. For example, you might want to group articles on the same topic automatically.
- Anomaly detection: You can use clustering to find outliers. We're going to learn about this in Chapter 6, Anomaly Detection. Anomaly dete...
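The "generating training data" use listed above can be sketched in a few lines of R. This is only an illustration under stated assumptions: the `iris` dataset, the 100/50 split, `centers = 2`, and the helper `nearest_center()` are all hypothetical choices and not the book's method. The idea is to cluster part of the data, treat the cluster ids as labels, and then label new observations with a cheap nearest-center rule.

```r
# Sketch: turn k-means cluster ids into labels, then label unseen points.
data(iris)
features <- scale(iris[, 1:4])

set.seed(1)
train_idx <- sample(nrow(features), 100)   # rows used for clustering
km <- kmeans(features[train_idx, ], centers = 2, nstart = 25)

# Hypothetical helper: assign each row of new_points to the closest center
nearest_center <- function(new_points, centers) {
  new_points <- as.matrix(new_points)
  # One column of squared Euclidean distances per cluster center
  d <- sapply(seq_len(nrow(centers)), function(j) {
    rowSums(sweep(new_points, 2, centers[j, ])^2)
  })
  d <- matrix(d, nrow = nrow(new_points))  # keep matrix shape for a single row
  apply(d, 1, which.min)
}

train_labels <- km$cluster                 # pseudo-labels for the clustered rows
new_labels <- nearest_center(features[-train_idx, ], km$centers)
print(table(new_labels))
```

The pseudo-labels in `train_labels` could equally be passed to any supervised learner; the nearest-center rule is used here only because it needs nothing beyond base R.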
Table of contents
- Preface
- Chapter 1
- Chapter 2
- Chapter 3
- Chapter 4
- Chapter 5
- Chapter 6
- Appendix