Python Reinforcement Learning
eBook - ePub

Solve complex real-world problems by mastering reinforcement learning algorithms using OpenAI Gym and TensorFlow

Sudharsan Ravichandiran, Sean Saito, Rajalingappaa Shanmugamani, Yang Wenzhuo

  1. 496 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS and Android

About this book

Apply modern reinforcement learning and deep reinforcement learning methods using Python and its powerful libraries

Key Features

  • Your entry point into the world of artificial intelligence using the power of Python
  • An example-rich guide to master various RL and DRL algorithms
  • Explore the power of modern Python libraries to gain confidence in building self-trained applications

Book Description

Reinforcement learning (RL) is one of the most active and promising branches of artificial intelligence. This Learning Path will help you master not only the basic reinforcement learning algorithms but also the advanced deep reinforcement learning algorithms.

The Learning Path starts with an introduction to RL, followed by OpenAI Gym and TensorFlow. You will then explore various RL concepts and algorithms, such as the Markov decision process, Monte Carlo methods, and dynamic programming, including value and policy iteration. You'll also work on various datasets, including image, text, and video. This example-rich guide will introduce you to deep RL algorithms, such as Dueling DQN, DRQN, A3C, PPO, and TRPO. You will gain experience in several domains, including gaming, image processing, and physical simulations. You'll explore TensorFlow and OpenAI Gym to implement algorithms that also predict stock prices, generate natural language, and even build other neural networks. You will also learn about imagination-augmented agents, learning from human preference, DQfD, HER, and many of the recent advancements in RL.

By the end of the Learning Path, you will have the knowledge and experience needed to implement RL and deep RL in your projects, and you will be ready to enter the world of artificial intelligence to solve various real-life problems.

This Learning Path includes content from the following Packt products:

  • Hands-On Reinforcement Learning with Python by Sudharsan Ravichandiran
  • Python Reinforcement Learning Projects by Sean Saito, Yang Wenzhuo, and Rajalingappaa Shanmugamani

What you will learn

  • Train an agent to walk using OpenAI Gym and TensorFlow
  • Solve multi-armed-bandit problems using various algorithms
  • Build intelligent agents using the DRQN algorithm to play the Doom game
  • Teach your agent to play Connect4 using AlphaGo Zero
  • Defeat Atari arcade games using the value iteration method
  • Discover how to deal with discrete and continuous action spaces in various environments

Who this book is for

If you're an ML/DL enthusiast interested in AI and want to explore RL and deep RL from scratch, this Learning Path is for you. Prior knowledge of linear algebra is expected.


Information

Year: 2019
ISBN: 9781838640149
Edition: 1
Subtopic: Neural Networks

Learning to Play Go

When considering the capabilities of AI, we often compare its performance on a particular task with what humans can achieve. AI agents are now able to surpass human-level competency in increasingly complex tasks. In this chapter, we will build an agent that learns how to play what is widely considered the most complex board game of all time: Go. We will become familiar with deep reinforcement learning algorithms that achieve superhuman performance, namely AlphaGo and AlphaGo Zero, both of which were developed by Google's DeepMind. We will also learn about Monte Carlo tree search, a popular tree-searching algorithm that is an integral component of turn-based game agents.
This chapter will cover the following topics:
  • Introduction to Go and relevant research in AI
  • Overview of AlphaGo and AlphaGo Zero
  • The Monte Carlo tree search algorithm
  • Implementation of AlphaGo Zero

A brief introduction to Go

Go is a board game that was first recorded in China more than two millennia ago. Like other common board games, such as chess, shogi, and Othello, Go is played by two players. They alternately place black and white stones on a 19x19 board, and the objective is to control more territory than the opponent by surrounding a larger total area of the board. A player captures the opponent's stones by surrounding them with their own stones. Captured stones are removed from the board, leaving empty points in which the opponent can generally no longer place stones unless the territory is captured back.
A game ends when both players pass consecutively or when either player resigns. At the end of a game, the winner is decided by counting each player's territory and the number of captured stones.
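The capture rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used later in the chapter: the names `group_and_liberties` and `place_stone` and the board encoding are assumptions made for this example, and the sketch ignores move legality (occupied points, suicide, and ko). A group of connected stones is captured when it has no adjacent empty points, which are usually called liberties.

```python
from typing import List, Set, Tuple

EMPTY, BLACK, WHITE = ".", "B", "W"  # hypothetical board encoding
Point = Tuple[int, int]


def neighbors(point: Point, size: int) -> List[Point]:
    """Return the orthogonally adjacent points that lie on the board."""
    r, c = point
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(nr, nc) for nr, nc in candidates if 0 <= nr < size and 0 <= nc < size]


def group_and_liberties(board: List[List[str]], start: Point) -> Tuple[Set[Point], Set[Point]]:
    """Flood-fill the group of same-colored stones containing `start`.

    Returns the stones in the group and the empty points adjacent to it.
    """
    color = board[start[0]][start[1]]
    stones, liberties = {start}, set()
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for nr, nc in neighbors(current, len(board)):
            if board[nr][nc] == EMPTY:
                liberties.add((nr, nc))
            elif board[nr][nc] == color and (nr, nc) not in stones:
                stones.add((nr, nc))
                frontier.append((nr, nc))
    return stones, liberties


def place_stone(board: List[List[str]], point: Point, color: str) -> int:
    """Place a stone, remove opposing groups left without liberties, and
    return the number of captured stones. Legality is not checked."""
    board[point[0]][point[1]] = color
    opponent = WHITE if color == BLACK else BLACK
    captured = 0
    for nr, nc in neighbors(point, len(board)):
        if board[nr][nc] == opponent:
            stones, liberties = group_and_liberties(board, (nr, nc))
            if not liberties:
                for sr, sc in stones:
                    board[sr][sc] = EMPTY
                captured += len(stones)
    return captured


# A lone white stone surrounded on three sides; Black fills the last liberty.
board = [[EMPTY] * 5 for _ in range(5)]
board[2][2] = WHITE
for r, c in [(1, 2), (3, 2), (2, 1)]:
    board[r][c] = BLACK
captured = place_stone(board, (2, 3), BLACK)
print(captured, board[2][2])
```

A 5x5 board is used here only to keep the example small; the same functions work unchanged on a 19x19 board.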

Go and other board games

Researchers have already created AI programs that outperform the best human players in board games such as chess and backgammon. In 1992, researchers from IBM developed TD-Gammon, which used classic reinforcement learning algorithms and an artificial neural network to play backgammon at the level of a top player. In 1997, Deep Blue, a chess-playing program developed by IBM and Carnegie Mellon University, defeated then-world champion Garry Kasparov in a six-game match. This was the first time that a computer program defeated the world champion in chess.
Developing Go-playing agents is not a new topic, so one may wonder what took researchers so long to replicate such successes in Go. The answer is simple: Go, despite its simple rules, is a far more complex game than chess. Imagine representing a board game as a tree, where each node is a snapshot of the board (which we also refer to as the board state) and its child nodes are the board states reachable through the moves available to the player whose turn it is. The height of the tree is essentially the number of moves a game lasts. A typical chess game lasts 80 moves, whereas a game of Go lasts 150, almost twice as long. Moreover, while the average number of possible moves in a chess turn is 35, a Go player has about 250 possible plays per move. By commonly cited estimates, Go has roughly 10^761 possible games, compared to about 10^120 in chess. It is impossible to enumerate every possible state of Go in a computer, and the sheer complexity of the game has made it difficult for researchers to develop an agent that can play the game at a world-class level.
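To get a feel for how quickly such a game tree grows, a rough back-of-the-envelope estimate is b^d, where b is the average number of moves per turn and d is the number of moves in a game. The sketch below applies this to the branching factors and game lengths given above, working in base-10 logarithms to keep the numbers readable. The function name `log10_game_tree_size` is an assumption made for this example. The figures quoted above for the total number of possible games come from separate estimates and are not reproduced by this simple calculation.

```python
import math


def log10_game_tree_size(branching_factor: float, game_length: int) -> float:
    """Return log10(b^d), the order of magnitude of a uniform game tree
    with `branching_factor` moves per turn and `game_length` turns."""
    return game_length * math.log10(branching_factor)


chess_log10 = log10_game_tree_size(35, 80)
go_log10 = log10_game_tree_size(250, 150)

print(f"chess: about 10^{chess_log10:.1f}")  # chess: about 10^123.5
print(f"Go:    about 10^{go_log10:.1f}")     # Go:    about 10^359.7
```

Even under this crude model, the Go tree is more than 200 orders of magnitude larger than the chess tree, which is why exhaustive search is not an option.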

Go and AI research

In October 2015, AlphaGo, a novel reinforcement learning agent for Go developed by researchers at Google's DeepMind, beat Fan Hui, the European champion, 5-0. The paper detailing the agent was published in Nature in January 2016. Later in 2016, AlphaGo challenged Lee Sedol, who, with 18 world championship titles, is considered one of the greatest players in modern history. AlphaGo won 4-1, marking a watershed moment in deep learning research and the game's history. In the following year, DeepMind published an updated version of AlphaGo, AlphaGo Zero, which defeated its predecessor 100 times in 100 games. After a comparatively short period of training (a matter of days in the case of AlphaGo Zero), these agents were able to learn and surpass the wisdom that mankind has accumulated over the thousands of years of the game's existence.
The following sections will discuss how AlphaGo and AlphaGo Zero work, including the algorithms and techniques that they use to learn and play the game. This will be followed by an implementation of AlphaGo Zero. Our exploration begins with Monte Carlo tree search, an algorithm that is integral to both AlphaGo and AlphaGo Zero for making decisio...

Table of contents

  1. Title Page
  2. Copyright and Credits
  3. About Packt
  4. Contributors
  5. Preface
  6. Introduction to Reinforcement Learning
  7. Getting Started with OpenAI and TensorFlow
  8. The Markov Decision Process and Dynamic Programming
  9. Gaming with Monte Carlo Methods
  10. Temporal Difference Learning
  11. Multi-Armed Bandit Problem
  12. Playing Atari Games
  13. Atari Games with Deep Q Network
  14. Playing Doom with a Deep Recurrent Q Network
  15. The Asynchronous Advantage Actor Critic Network
  16. Policy Gradients and Optimization
  17. Balancing CartPole
  18. Simulating Control Tasks
  19. Building Virtual Worlds in Minecraft
  20. Learning to Play Go
  21. Creating a Chatbot
  22. Generating a Deep Learning Image Classifier
  23. Predicting Future Stock Prices
  24. Capstone Project - Car Racing Using DQN
  25. Looking Ahead
  26. Assessments
  27. Other Books You May Enjoy
Citation styles for Python Reinforcement Learning

APA 6 Citation

Ravichandiran, S., Saito, S., Shanmugamani, R., & Wenzhuo, Y. (2019). Python Reinforcement Learning (1st ed.). Packt Publishing. Retrieved from https://www.perlego.com/book/960447/python-reinforcement-learning-solve-complex-realworld-problems-by-mastering-reinforcement-learning-algorithms-using-openai-gym-and-tensorflow-pdf (Original work published 2019)

Chicago Citation

Ravichandiran, Sudharsan, Sean Saito, Rajalingappaa Shanmugamani, and Yang Wenzhuo. (2019) 2019. Python Reinforcement Learning. 1st ed. Packt Publishing. https://www.perlego.com/book/960447/python-reinforcement-learning-solve-complex-realworld-problems-by-mastering-reinforcement-learning-algorithms-using-openai-gym-and-tensorflow-pdf.

Harvard Citation

Ravichandiran, S. et al. (2019) Python Reinforcement Learning. 1st edn. Packt Publishing. Available at: https://www.perlego.com/book/960447/python-reinforcement-learning-solve-complex-realworld-problems-by-mastering-reinforcement-learning-algorithms-using-openai-gym-and-tensorflow-pdf (Accessed: 14 October 2022).

MLA 7 Citation

Ravichandiran, Sudharsan et al. Python Reinforcement Learning. 1st ed. Packt Publishing, 2019. Web. 14 Oct. 2022.