What do we mean by machine learning? It's an interdisciplinary subject concerned with the development, comprehension, and application of computational methods that learn and generalize from datasets; it is often related, but not limited, to big data. Machine learning comprises an ever-growing family of methods suited to a wide range of problems.
I deeply appreciate how it has been used to fight junk email. The way it suggests replies to emails (those that are hardly spam) has proved to be of enormous help too.
Such a great ability to solve problems has certainly attracted big companies and tech geeks all over the world.
Netflix uses machine learning to give you personal recommendations of content to watch; Amazon uses machine learning to recommend products to buy based on what you've already bought. These are the so-called recommenders. They are often (but not exclusively) built using clustering techniques.
Machine learning techniques have also been used to diagnose illnesses. Aside from the application of clustering in cancer diagnosis already mentioned in Chapter 4, KDD, Data Mining, and Text Mining, neural networks can be trained to read various exams and even predict how likely a patient is to develop certain kinds of diseases. This field is called predictive medicine, and it benefits greatly from machine learning advancements.
Saving endangered species is yet another wonderful use of machine learning. Researchers from the University of Southern California Center for AI in Society have trained a neural network to detect illegal hunters who set foot in national parks in Zimbabwe and Malawi. The system is designed to distinguish hunters from animals using heat signatures and was named Systematic POacher deTector (SPOT).
There are also unconventional uses of machine learning models. Some people are using them to compose songs, write poems, and draw figures.
Tech workers, such as Zach Lubarsky and Ethan Phelps-Goodman, are actively engaging in data-driven campaigns to solve social issues. Lubarsky and Phelps-Goodman belong to the Seattle Tech 4 Housing organization, a community dedicated to improving housing affordability in Seattle.
A quick web search will tell you that there are as many real-world applications of machine learning as there are stars in the sky. Speaking of stars, how do you think the galaxy-sized datasets generated by astronomers are being processed? That's right, machine learning.
This collection of methods can be separated into two classes: unsupervised (unlabeled) and supervised (labeled) learning. For the former, there is no target value to fit the models to; hierarchical clustering is a good example. The objective of unsupervised learning is usually, but not always, to extract features from data rather than to produce actual forecasts.
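The idea of fitting without a target value can be sketched in a few lines of base R. This is only an illustrative sketch, not the chapter's own example: the built-in `iris` data, the average linkage method, and the choice of three clusters are all assumptions made here for the sake of the demonstration.

```r
# Unsupervised sketch: hierarchical clustering on unlabeled measurements.
# Assumption: the built-in iris data stands in for "a dataset"; the Species
# column (the label) is deliberately dropped, so no target value is used.
features <- scale(iris[, 1:4])                  # standardize the four numeric columns
distances <- dist(features)                     # Euclidean distances between rows
tree <- hclust(distances, method = "average")   # build the cluster hierarchy
groups <- cutree(tree, k = 3)                   # cut the tree into 3 clusters (k is a choice)
table(groups)                                   # how many observations fell into each cluster
```

The cluster memberships in `groups` are a feature extracted from the data rather than a forecast, which is the distinction drawn above.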
Next, we will look at how traditional statistics connects to machine learning. There are many clear connections between the two streams. To mention one, regressions from traditional statistics also appear in machine learning applications. Ronald Fisher, a renowned statistician, is recognized by some as being among the first to use machine learning.
Supervised learning models are trained to target one or more variables; hence, you need labeled data. Recurrent neural networks (RNNs) can be cited as a supervised learning technique. Although practical examples for both classes are provided in this chapter, more attention is given to unsupervised learning, since supervised learning is the focus of later chapters such as Chapter 8, Neural Networks and Deep Learning.
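To make the need for labeled data concrete, here is a minimal supervised sketch. It swaps in a plain linear regression for the neural networks mentioned above, and the built-in `mtcars` data, the `mpg ~ wt` formula, and the odd/even row split are assumptions chosen only for illustration.

```r
# Supervised sketch: the model is trained against a known target (mpg).
# Assumption: mtcars and the odd/even split are illustrative choices only.
train_rows <- seq(1, nrow(mtcars), by = 2)   # odd-numbered rows for training
train <- mtcars[train_rows, ]
test  <- mtcars[-train_rows, ]               # remaining rows are held out

model <- lm(mpg ~ wt, data = train)          # fit using the labeled target mpg
predictions <- predict(model, newdata = test)

# Root mean squared error on data the model has not seen
rmse <- sqrt(mean((test$mpg - predictions)^2))
rmse
```

Without the `mpg` column there would be nothing to fit against, which is what separates this from the unsupervised case.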
Although many concepts adopted in the machine learning field are essentially the same as the ones that arose from traditional statisticians and forecasters, machine learning has a vocabulary of its own. The differences may have arisen because the field's main proponents were closer to computing than to statistics.
There is no downside to learning this vocabulary. A great way to do so is to relate machine learning terms to statistical ones. Moving on to the next section, we can see how many core ideas from machine learning can be somehow translated into statistical concepts.
At the end of the last section, we hypothesized why machine learning managed to diverge in vocabulary from statistics. Let me begin this section by discussing why the core ideas converge in essence. Many statistical methods aim to prae e videre, which is Latin for seeing something before it actually happens or, simply, to predict.
Prediction tasks, like other pattern recognition tasks, often require a sharp ability to comprehend data and to generalize well to as yet unseen information. This shared goal led the distinct efforts of traditional statistics and machine learning to a great deal of common ground. Also, the ability of statistics to frame all sorts of events in a probabilistic way makes it very useful to machine learning, which could be another source of shared ground across the two fields, not to mention the interdisciplinary nature of machine learning.
Whatever the reason, machine learning vocabulary can be adapted and understood through statistics. This translation makes it especially easy for lovers of statistics to master machine learning and vice versa. The paper Neural Networks and Statistical Models, written by Warren S. Sarle and published in 1994, showed how machine learning jargon could be related to statistical jargon. Here are some of the correspondences:
| Statistical jargon | Machine learning counterpart |
| --- | --- |
| Model estimation | Model training or learning |
| Estimation criteria | Cost function |
| Variables | Features |
| Independent variables | Inputs |
| Predicted values | Outputs |
| Dependent variables | Training or target values |
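The correspondences in the table can be traced in a short R sketch. It is an illustration under stated assumptions rather than a definitive implementation: the built-in `cars` data, the mean squared error cost, and the variable names `inputs`, `targets`, `outputs`, `cost`, `trained`, and `estimated` are all introduced here purely to mirror the two vocabularies.

```r
# Inputs (independent variable) and target values (dependent variable).
# Assumption: the built-in cars data is used only as a convenient example.
inputs  <- cars$speed
targets <- cars$dist

# Cost function (estimation criterion): mean squared error of a straight line.
cost <- function(params) {
  outputs <- params[1] + params[2] * inputs   # outputs (predicted values)
  mean((targets - outputs)^2)
}

# "Model training": minimize the cost numerically from an arbitrary start.
trained <- optim(par = c(0, 0), fn = cost, method = "BFGS",
                 control = list(reltol = 1e-12))

# "Model estimation": ordinary least squares through lm().
estimated <- lm(dist ~ speed, data = cars)

# The two sets of coefficients should land close to each other.
rbind(trained = trained$par, estimated = coef(estimated))
```

Both routes minimize the same squared-error criterion, so the numerical "training" should approximately recover the coefficients that `lm()` "estimates"; only the vocabulary differs.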
Now that we acknowledge the existence of a link between statistics and machine learning, it is time to take a practical tour through the traditional methods of linear regression given by statistics using our beloved R, but not before examining the general tasks that machine learning is up to.
Whether a problem can be solved through machine learning is only a matter of how much data, creativity, and computational power one has. Machine learning can be used to aid diagnosis, make recommendations, classify stellar objects, protect animal life, and tackle social issues.
It can likewise be used to detect frauds, such as fraudulent credit card t...