eBook - ePub

Hands-On Reinforcement Learning with Python

Name: Hands-On Reinforcement Learning with Python
Author: Sudharsan Ravichandiran

Master reinforcement and deep reinforcement learning using OpenAI Gym and TensorFlow

Sudharsan Ravichandiran

318 pages
English
ePUB (mobile friendly)
Available on iOS and Android

About this book

A hands-on guide enriched with examples to master deep reinforcement learning algorithms with Python

Key Features

Your entry point into the world of artificial intelligence using the power of Python
An example-rich guide to master various RL and DRL algorithms
Explore various state-of-the-art architectures along with math

Book Description

Reinforcement learning (RL) is one of the most popular and promising branches of artificial intelligence. Hands-On Reinforcement Learning with Python will help you master not only the basic reinforcement learning algorithms but also the advanced deep reinforcement learning algorithms.

The book starts with an introduction to reinforcement learning, followed by OpenAI Gym and TensorFlow. You will then explore various RL algorithms and concepts, such as the Markov Decision Process, Monte Carlo methods, and dynamic programming, including value and policy iteration. This example-rich guide will introduce you to deep reinforcement learning algorithms, such as Dueling DQN, DRQN, A3C, PPO, and TRPO. You will also learn about imagination-augmented agents, learning from human preference, DQfD, HER, and many more of the recent advancements in reinforcement learning.

By the end of the book, you will have all the knowledge and experience needed to implement reinforcement learning and deep reinforcement learning in your projects, and you will be all set to enter the world of artificial intelligence.

What you will learn

Understand the basics of reinforcement learning methods, algorithms, and elements
Train an agent to walk using OpenAI Gym and TensorFlow
Understand the Markov Decision Process, Bellman's optimality, and TD learning
Solve multi-armed-bandit problems using various algorithms
Master deep learning algorithms, such as RNN, LSTM, and CNN with applications
Build intelligent agents using the DRQN algorithm to play the Doom game
Teach agents to play the Lunar Lander game using DDPG
Train an agent to win a car racing game using dueling DQN

Who this book is for

If you're a machine learning developer or deep learning enthusiast interested in artificial intelligence and want to learn about reinforcement learning from scratch, this book is for you. Some knowledge of linear algebra, calculus, and the Python programming language will help you understand the concepts covered in this book.


Information

Publisher: Packt Publishing
Year: 2018
ISBN: 9781788836913
Subject: Computer Science
Subtopic: Artificial Intelligence (AI) & Semantics

Deep Learning Fundamentals

So far, we have learned how reinforcement learning (RL) works. In the upcoming chapters, we will learn about deep reinforcement learning (DRL), which combines deep learning and RL. DRL is creating a lot of buzz in the RL community and is having a serious impact on solving many RL tasks. To understand DRL, we need a strong foundation in deep learning. Deep learning is a subset of machine learning and it is all about neural networks. Deep learning is not new, but it is so popular right now because of advances in computing power and the availability of huge volumes of data. Given enough data, deep learning algorithms often outperform classic machine learning algorithms. Therefore, in this chapter, we will learn about several deep learning algorithms, such as the recurrent neural network (RNN), Long Short-Term Memory (LSTM), and the convolutional neural network (CNN), along with their applications.

In this chapter, you will learn about the following:

Artificial neurons
Artificial neural networks (ANNs)
Building a neural network to classify handwritten digits
RNNs
LSTMs
Generating song lyrics using LSTMs
CNNs
Classifying fashion products using CNNs

Artificial neurons

Before understanding ANNs, let's first understand what neurons are and how the neurons in our brain actually work. A neuron can be defined as the basic computational unit of the human brain. Our brain contains approximately 100 billion neurons, which are connected to one another through synapses. Neurons receive input from the external environment, from sensory organs, or from other neurons through branchlike structures called dendrites. These inputs are strengthened or weakened, that is, they are weighted according to their importance, and then they are summed together in the soma (cell body). From the cell body, the summed inputs are processed, travel along the axon, and are sent on to other neurons. A single biological neuron is shown in the following diagram:

Now, how do artificial neurons work? Let's suppose we have three inputs, x₁, x₂, and x₃, to predict the output y. These inputs are multiplied by the weights w₁, w₂, and w₃ and are summed together, that is, x₁·w₁ + x₂·w₂ + x₃·w₃. But why do we multiply the inputs by weights? Because the inputs are not all equally important in calculating the output y. Let's say that x₂ is more important in calculating the output than the other two inputs. Then we assign a higher value to w₂ than to the other two weights, so that after multiplying by the weights, x₂ contributes more to the sum than the other two inputs. After multiplying the inputs by the weights, we sum them up and add a value called the bias, b. So, z = (x₁·w₁ + x₂·w₂ + x₃·w₃) + b, that is:

$$z = \sum_{i=1}^{3} x_i w_i + b$$

Doesn't z look like the equation of linear regression? Isn't it just the equation of a straight line, z = mx + b, where m is the weight (coefficient), x is the input, and b is the bias (intercept)? Well, yes. Then what is the difference between a neuron and linear regression? In a neuron, we introduce non-linearity to the result, z, by applying a function f() called the activation or transfer function. So, our output is y = f(z). A single artificial neuron is shown in the following diagram:

In a neuron, we take the input x, multiply it by the weights w, and add the bias b to get z; we then apply the activation function f() to z to predict the output y.
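The computation just described can be sketched in a few lines of plain Python. This is only an illustration: the input, weight, and bias values are hypothetical, and the sigmoid is assumed as the activation function f() because the text above does not name a specific one.

```python
import math


def sigmoid(z):
    """Assumed activation function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Return (z, y) for a single artificial neuron."""
    # Weighted sum of the inputs plus the bias: z = x1*w1 + x2*w2 + x3*w3 + b
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Non-linearity applied to the weighted sum: y = f(z)
    return z, activation(z)


# Hypothetical values; w2 is the largest weight, so x2 influences z the most.
x = [1.0, 2.0, 3.0]
w = [0.1, 0.8, 0.1]
b = 0.5

z, y = neuron(x, w, b)
print(f"z = {z:.2f}, y = {y:.4f}")
```

Without the call to `activation`, the function would return only z, which is the linear regression equation; the activation is what makes the neuron non-linear.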

ANNs

Neurons are cool, right? But a single neuron cannot perform complex tasks, which is why our brain has billions of neurons, organized in layers, forming a network. Similarly, artificial neurons are arranged in layers, and the layers are connected so that information is passed from one layer to the next. A typical ANN consists of the following layers:

Input layer
Hidden layer
Output layer

Each layer has a collection of neurons, and the neurons in one layer interact with all the neurons in the adjacent layers. However, neurons in the same layer do not interact with each other. A typical ANN is shown in the following diagram:

Input layer

The input layer is where we feed input to the network. The number of neurons in the input layer equals the number of inputs we feed to the network. Each input has some influence on the predicted output: it is multiplied by a weight, a bias is added, and the result is passed to the next layer.
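As a rough sketch of how the inputs travel from the input layer to the next layer, the following NumPy snippet multiplies three inputs by a weight matrix, adds a bias, and applies an activation. The layer sizes, the random weights, and the choice of ReLU as the activation are all assumptions made for illustration, not values taken from the text.

```python
import numpy as np


def relu(z):
    """Assumed activation function: keeps positive values, zeroes out the rest."""
    return np.maximum(0.0, z)


rng = np.random.default_rng(seed=0)

n_inputs, n_hidden = 3, 4          # hypothetical layer sizes

x = np.array([1.0, 2.0, 3.0])      # one sample with three input values
# One weight per connection: every input neuron connects to every neuron
# in the next layer, so W has shape (n_inputs, n_hidden).
W = rng.normal(size=(n_inputs, n_hidden))
b = np.zeros(n_hidden)             # one bias per neuron in the next layer

z = x @ W + b                      # weighted sums for all four neurons at once
h = relu(z)                        # values passed on to the following layer

print(h.shape)
```

Each column of `W` holds the weights of one neuron in the next layer, so the matrix product computes the same weighted sum as a single neuron, once per column.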

Hidden layer

Any layer between the input layer and the output layer is called a hidden layer. It processes the input received from the input layer. The hidden layer is responsible for deriving complex relationships between input and output; that is, it identifies the patterns in the dataset. There can be any number of hidden layers, but we have to choose the number according to our problem. For a very simple problem, we can use just one hidden layer, but for complex tasks like image recognition, we use many hidden layers where each layer is responsible for extracting important...