Distributed Computing with Python
eBook - ePub

  1. 170 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Harness the power of multiple computers using Python through this fast-paced, informative guide

  • You'll learn to write data processing programs in Python that are highly available, reliable, and fault tolerant
  • Make use of Amazon Web Services along with Python to establish a powerful remote computation system
  • Train Python to handle data-intensive and resource-hungry applications

Who This Book Is For

This book is for Python developers who have developed Python programs for data processing and now want to learn how to write fast, efficient programs that perform CPU-intensive data processing tasks.

What You Will Learn

  • Get an introduction to parallel and distributed computing
  • Understand synchronous and asynchronous programming
  • Explore parallelism in Python
  • Build distributed applications with Celery
  • Run Python in the cloud
  • Run Python on an HPC cluster
  • Test and debug distributed applications

In Detail

CPU-intensive data processing tasks have become crucial given the complexity of today's big data applications. Reducing CPU utilization per process is important for improving the overall speed of applications.


This book will teach you how to perform parallel execution of computations by distributing them across multiple processors in a single machine, thus improving the overall performance of a big data processing task. We will cover synchronous and asynchronous models, shared memory and file systems, communication between various processes, synchronization, and more.
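As a rough illustration of distributing a computation across the processors of a single machine, the following sketch uses the standard library `multiprocessing` module. The workload (counting primes) and the helper names `count_primes` and `split_range` are assumptions chosen for this example; they are not taken from the book.

```python
import multiprocessing as mp


def count_primes(bounds):
    """Count the primes in the half-open range [start, stop)."""
    start, stop = bounds
    count = 0
    for n in range(max(start, 2), stop):
        # Trial division up to the square root of n.
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def split_range(limit, parts):
    """Split [0, limit) into at most `parts` contiguous chunks."""
    step = -(-limit // parts)  # ceiling division
    return [(lo, min(lo + step, limit)) for lo in range(0, limit, step)]


if __name__ == "__main__":
    # One chunk per CPU; each chunk is processed in a separate worker process.
    chunks = split_range(200_000, mp.cpu_count())
    with mp.Pool() as pool:
        total = sum(pool.map(count_primes, chunks))
    print(total)
```

Each worker process handles one chunk independently, so no state is shared between them; only the chunk bounds and the per-chunk counts cross process boundaries.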

Style and Approach

This example-based, step-by-step guide will show you how to make the best of your hardware configuration using Python for distributing applications.

Distributed Computing with Python by Francesco Pierfederici is available in PDF and/or ePUB format, alongside other books in Computer Science & Programming in Python.

Information

Year
2016
ISBN
9781785889691
Edition
1

Table of Contents

Distributed Computing with Python
Credits
About the Author
About the Reviewer
www.PacktPub.com
eBooks, discount offers, and more
Why subscribe?
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. An Introduction to Parallel and Distributed Computing
Parallel computing
Distributed computing
Shared memory versus distributed memory
Amdahl's law
The mixed paradigm
Summary
2. Asynchronous Programming
Coroutines
An asynchronous example
Summary
3. Parallelism in Python
Multiple threads
Multiple processes
Multiprocess queues
Closing thoughts
Summary
4. Distributed Applications – with Celery
Establishing a multimachine environment
Installing Celery
Testing the installation
A tour of Celery
More complex Celery applications
Celery in production
Celery alternatives – Python-RQ
Celery alternatives – Pyro
Summary
5. Python in the Cloud
Cloud computing and AWS
Creating an AWS account
Creating an EC2 instance
Storing data in Amazon S3
Amazon Elastic Beanstalk
Creating a private cloud
Summary
6. Python on an HPC Cluster
Your typical HPC cluster
Job schedulers
Running a Python job using HTCondor
Running a Python job using PBS
Debugging
Summary
7. Testing and Debugging Distributed Applications
The big picture
Common problems – clocks and time
Common problems – software environments
Common problems – permissions and environments
Common problems – the availability of hardware resources
Challenges – the development environment
A useful strategy – logging everything
A useful strategy – simulating components
Summary
8. The Road Ahead
The first two chapters
The tools
The cloud and the HPC world
Debugging and monitoring
Where to go next
Index

Distributed Computing with Python

Copyright © 2016 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: April 2016
Production reference: 1060416
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78588-969-1
www.packtpub.com

Credits

Author
Francesco Pierfederici
Reviewer
James King
Commissioning Editor
Veena Pagare
Acquisition Editor
Aaron Lazar
Content Development Editor
Parshva Sheth
Technical Editor
Abhishek R. Kotian
Copy Editor
Neha Vyas
Project Coordinator
Nikhil Nair
Proofreader
Safis Editing
Indexer
Rekha Nair
Graphics
Disha Haria
Production Coordinator
Melwyn Dsa
Cover Work
Melwyn Dsa

About the Author

Francesco Pierfederici is a software engineer who loves Python. He has been working in the fields of astronomy, biology, and numerical weather forecasting for the last 20 years.
He has built large distributed systems that make use of tens of thousands of cores at a time and run on some of the fastest supercomputers in the world. He has also written a lot of applications of dubious usefulness that are nonetheless great fun. Mostly, he just likes to build things.

About the Reviewer

James King is a software developer with a broad range of experience in distributed systems. He is a contributor to many open source projects including OpenStack and Mozilla Firefox. He enjoys mathematics, horsing around with his kids, games, and art.

www.PacktPub.com

eBooks, discount offers, and more

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

  • Fully searchable across every book published by Packt
  • Copy and paste, print, and bookmark content
  • On demand and accessible via a web browser

Preface

Parallel and distributed computing is a fascinating subject that, only a few years ago, was the preserve of developers at a handful of large companies and national labs. Things have changed dramatically in the last decade or so, and now everybody can build small- and medium-scale distributed applications in a variety of programming languages including, of course, our favorite one: Python.
This book is a very practical guide for Python programmers who are starting to build their own distributed systems. It starts off by illustrating the bare minimum theoretical concepts needed to understand parallel and distributed computing in order to lay the basic foundations required for the rest of the (more practical) chapters.
It then looks at some first examples of parallelism using nothing more than modules from the Python standard library. The next step is to move beyond the confines of a single computer and start using more and more nodes. This is accomplished using a number of third-party libraries, including Celery and Pyro.
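To give a flavor of what parallelism with nothing but the standard library can look like, here is a minimal sketch using `concurrent.futures`. The `fetch` function is a hypothetical stand-in for an I/O-bound call, and `run_all` is a helper invented for this illustration; neither comes from the book's own examples.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(task_id, delay=0.1):
    """Hypothetical stand-in for an I/O-bound call: sleep, then return a result."""
    time.sleep(delay)
    return task_id * 2


def run_all(task_ids, workers=4):
    """Run fetch() for every id on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, task_ids))


if __name__ == "__main__":
    started = time.perf_counter()
    print(run_all(range(8)))
    print(f"elapsed: {time.perf_counter() - started:.2f}s")
```

Because the simulated work spends its time waiting rather than computing, the threads overlap and the elapsed time should come out well below the sum of the individual delays. For CPU-bound work, a process pool would usually be the more appropriate choice.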
The remaining chapters investigate a few deployment options for our distributed applications. The cloud and classic High Performance Computing (HPC) clusters, together with their strengths and challenges, take center stage.
Finally, the thorny issues of monitoring, logging, profiling, and debugging are touched upon.
All in all, this is very much a hands-on bo...