eBook - ePub

OpenCV 3 Computer Vision with Python Cookbook

Name: OpenCV 3 Computer Vision with Python Cookbook
Author: Alexey Spizhevoy, Aleksandr Rybnikov

306 pages
English

Recipe-based approach to tackle the most common problems in Computer Vision by leveraging the functionality of OpenCV using Python APIs

About This Book

• Build computer vision applications with OpenCV functionality via Python API
• Get to grips with image processing, multiple view geometry, and machine learning
• Learn to use deep learning models for image classification, object detection, and face recognition

Who This Book Is For

This book is for developers who have a basic knowledge of Python. If you are aware of the basics of OpenCV and are ready to build computer vision systems that are smarter, faster, more complex, and more practical than the competition, then this book is for you.

What You Will Learn

• Get familiar with low-level image processing methods
• See the common linear algebra tools needed in computer vision
• Work with different camera models and epipolar geometry
• Find out how to detect interesting points in images and compare them
• Binarize images and mask out regions of interest
• Detect objects and track them in videos

In Detail

OpenCV 3 is a native cross-platform library for computer vision, machine learning, and image processing. OpenCV's convenient high-level APIs hide very powerful internals designed for computational efficiency that can take advantage of multicore and GPU processing. This book will help you tackle increasingly challenging computer vision problems by providing a number of recipes that you can use to improve your applications.

In this book, you will learn how to process an image by manipulating pixels and analyze an image using histograms. Then, we'll show you how to apply image filters to enhance image content and exploit the image geometry in order to relay different views of a pictured scene. We'll explore techniques to achieve camera calibration and perform a multiple-view analysis.

Later, you'll work on reconstructing a 3D scene from images, converting low-level pixel information to high-level concepts for applications such as object detection and recognition. You'll also discover how to process video from files or cameras and how to detect and track moving objects. Finally, you'll get acquainted with recent approaches in deep learning and neural networks.

By the end of the book, you'll be able to apply your skills in OpenCV to create computer vision applications in various domains.

Style and approach

This book helps you learn the core concepts of OpenCV faster by taking a recipe-based approach where you can try out different code snippets to understand a concept.

Information

Publisher: Packt Publishing
Year: 2018
ISBN: 9781788478755
Topic: Computer Science
Subtopic: Programming in Python
Index: Computer Science

Object Detection and Machine Learning

In this chapter, we will cover the following recipes:

Obtaining an object mask using the GrabCut algorithm
Finding edges using the Canny algorithm
Detecting lines and circles using the Hough transform
Finding objects via template matching
The real-time median-flow object tracker
Tracking objects using different algorithms via the tracking API
Computing the dense optical flow between two frames
Detecting chessboard and circle grid patterns
A simple pedestrian detector using the SVM model
Optical character recognition using different machine learning models
Detecting faces using Haar/LBP cascades
Detecting ArUco patterns for AR applications
Detecting text in natural scenes
The QR code detector and recognizer

Introduction

Our world contains a lot of objects. Each type of object has its own features that distinguish it from some types and, at the same time, make it similar to others. Understanding the scene through the objects in it is a key task in computer vision. Being able to find and track various objects, detect basic patterns and complex structures, and recognize text are challenging and useful skills, and this chapter addresses questions on how to implement and use them with OpenCV functionality.

We will review the detection of geometric primitives, such as lines, circles, and chessboards, and more complex objects, such as pedestrians, faces, ArUco, and QR code patterns. We will also perform object tracking tasks.

Obtaining an object mask using the GrabCut algorithm

There are cases where we want to separate an object from other parts of a scene; in other words, where we want to create masks for the foreground and background. This job is tackled by the GrabCut algorithm. It can build object masks in semi-automatic mode. All that it needs are initial assumptions about object location. Based on these assumptions, the algorithm performs a multi-step iterative procedure to model statistical distributions of foreground and background pixels and find the best division according to the distributions. This sounds complicated, but the usage is very simple. Let's find out how easily we can apply this sophisticated algorithm in OpenCV.

Getting ready

Before you proceed with this recipe, you need to install the OpenCV 3.x Python API package.

How to do it...

Import the modules:

import cv2
import numpy as np

Open an image and define the mouse callback function to draw a rectangle on the image:

img = cv2.imread('../data/Lena.png', cv2.IMREAD_COLOR)
show_img = np.copy(img)

mouse_pressed = False
y = x = w = h = 0

def mouse_callback(event, _x, _y, flags, param):
    global show_img, x, y, w, h, mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        # Start a new rectangle at the click position
        mouse_pressed = True
        x, y = _x, _y
        show_img = np.copy(img)

    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            # Redraw the rectangle on a fresh copy while dragging
            show_img = np.copy(img)
            cv2.rectangle(show_img, (x, y),
                          (_x, _y), (0, 255, 0), 3)

    elif event == cv2.EVENT_LBUTTONUP:
        # Store the rectangle size when the button is released
        mouse_pressed = False
        w, h = _x - x, _y - y

Display the image. Once the rectangle has been drawn, press the A key to close the window:

cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)

while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    # Leave the loop only after a rectangle has been drawn
    if k == ord('a') and not mouse_pressed:
        if w*h > 0:
            break

cv2.destroyAllWindows()
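
The callback above stores w and h as signed differences, so a drag that ends above or to the left of its starting point produces negative values. The following is a minimal sketch of one way to guard against that. The helper name normalize_rect and its clipping behaviour are assumptions made for illustration and are not part of the recipe or of OpenCV:

```python
def normalize_rect(x0, y0, x1, y1, img_w, img_h):
    """Return (x, y, w, h) with w, h >= 0, clipped to the image bounds.

    Hypothetical helper: (x0, y0) and (x1, y1) are the two corners of a
    mouse drag, in any order.
    """
    # Order the corners so that (left, top) is the upper-left one
    left, right = sorted((x0, x1))
    top, bottom = sorted((y0, y1))

    # Clip every coordinate to valid pixel positions
    left = max(0, min(left, img_w - 1))
    right = max(0, min(right, img_w - 1))
    top = max(0, min(top, img_h - 1))
    bottom = max(0, min(bottom, img_h - 1))

    return left, top, right - left, bottom - top


# A drag from the lower right to the upper left still gives a valid rectangle
print(normalize_rect(100, 80, 20, 10, 512, 512))
```

Such a helper could be called in the cv2.EVENT_LBUTTONUP branch, with the press position and the release position as the two corners, before the rectangle is passed to cv2.grabCut.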

Call cv2.grabCut to create an object mask from the rectangle that was drawn. Then darken the pixels labeled as background and show the result:

labels = np.zeros(img.shape[:2],np.uint8)

labels, bgdModel, fgdModel = cv2.grabCut(img, labels, (x, y, w, h), None, None, 5, cv2.GC_INIT_WITH_RECT)

show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3

cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()

Define a second mouse callback that paints corrections onto the mask. This lets us repair mistakes left by the previous cv2.grabCut call:

label = cv2.GC_BGD
lbl_clrs = {cv2.GC_BGD: (0,0,0), cv2.GC_FGD: (255,255,255)}

def mouse_callback(event, x, y, flags, param):
    global mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        # Paint the current label into the mask and its color onto the preview
        mouse_pressed = True
        cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
        cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)

    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
            cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)

    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False

Show the image with the mask. Draw in white where object pixels have been labeled as background, and in black where background areas have been marked as belonging to the object. Press the L key to switch between the two labels and the A key to finish. Then, call cv2.grabCut again to get the corrected mask. Finally, update the mask on the image and show it:

cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)

while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    if k == ord('a') and not mouse_pressed:
        break
    elif k == ord('l'):
        # Toggle between cv2.GC_BGD (0) and cv2.GC_FGD (1)
        label = cv2.GC_FGD - label

cv2.destroyAllWindows()

labels, bgdModel, fgdModel = cv2.grabCut(img, labels, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)

show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3

cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()

How it works...

OpenCV's cv2.grabCut implements the GrabCut algorithm. This function can work in several modes and takes the following arguments: an input 3-channel image, a matrix with initial labels for pixels, a rectangle in (x, y, w, h) format to define label initialization, two matrices to store the process state, the number of iterations, and the mode in which the function should run.

The function returns the labels matrix and two matrices with the state of the process. The labels matrix is single-channel, and each pixel stores one of these values: cv2.GC_BGD (the pixel definitely belongs to the background), cv2.GC_PR_BGD (the pixel is probably background), cv2.GC_PR_FGD (the pixel is probably foreground), or cv2.GC_FGD (the pixel is definitely foreground). The two state matrices are needed if we want to continue the process for a few more iterations.
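
Since the labels matrix holds four possible values, a common follow-up step is to collapse it into a binary mask. The following is a minimal sketch using NumPy only. The helper names foreground_mask and cut_out are hypothetical, and the four constants are defined locally with the same integer values as the corresponding cv2.GC_* constants so that the snippet runs without OpenCV:

```python
import numpy as np

# Same integer values as cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def foreground_mask(labels):
    """Return a uint8 mask: 255 for definite or probable foreground, else 0."""
    is_fg = (labels == GC_FGD) | (labels == GC_PR_FGD)
    return np.where(is_fg, 255, 0).astype(np.uint8)


def cut_out(img, labels):
    """Keep foreground pixels of a 3-channel image and zero the rest."""
    keep = foreground_mask(labels) // 255  # 1 for foreground, 0 otherwise
    return img * keep[..., None]           # broadcast over the channels


# Tiny hand-made example: one pixel of each label type
labels = np.array([[GC_BGD, GC_PR_BGD],
                   [GC_PR_FGD, GC_FGD]], dtype=np.uint8)
img = np.full((2, 2, 3), 90, dtype=np.uint8)

print(foreground_mask(labels))
print(cut_out(img, labels)[..., 0])
```

Treating the "probable" labels the same as the definite ones is a design choice of this sketch; an application that needs a conservative mask could keep only the pixels equal to GC_FGD.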

There are three possible modes for the function: cv2.GC...

Title Page
Copyright and Credits
Packt Upsell
Contributors
Preface
I/O and GUI
Matrices, Colors, and Filters
Contours and Segmentation
Object Detection and Machine Learning
Deep Learning
Linear Algebra
Detectors and Descriptors
Image and Video Processing
Multiple View Geometry
Other Books You May Enjoy