OpenCV 3 Computer Vision with Python Cookbook

  1. 306 pages
  2. English

About This Book

Recipe-based approach to tackle the most common problems in Computer Vision by leveraging the functionality of OpenCV using Python APIs

  • Build computer vision applications with OpenCV functionality via Python API
  • Get to grips with image processing, multiple view geometry, and machine learning
  • Learn to use deep learning models for image classification, object detection, and face recognition

Who This Book Is For

This book is for developers who have a basic knowledge of Python. If you are aware of the basics of OpenCV and are ready to build computer vision systems that are smarter, faster, more complex, and more practical than the competition, then this book is for you.

What You Will Learn

  • Get familiar with low-level image processing methods
  • See the common linear algebra tools needed in computer vision
  • Work with different camera models and epipolar geometry
  • Find out how to detect interesting points in images and compare them
  • Binarize images and mask out regions of interest
  • Detect objects and track them in videos

In Detail

OpenCV 3 is a native cross-platform library for computer vision, machine learning, and image processing. OpenCV's convenient high-level APIs hide very powerful internals designed for computational efficiency that can take advantage of multicore and GPU processing. This book will help you tackle increasingly challenging computer vision problems by providing a number of recipes that you can use to improve your applications.

In this book, you will learn how to process an image by manipulating pixels and analyze an image using histograms. Then, we'll show you how to apply image filters to enhance image content and exploit the image geometry in order to relay different views of a pictured scene. We'll explore techniques to achieve camera calibration and perform a multiple-view analysis.

Later, you'll work on reconstructing a 3D scene from images, converting low-level pixel information to high-level concepts for applications such as object detection and recognition. You'll also discover how to process video from files or cameras and how to detect and track moving objects. Finally, you'll get acquainted with recent approaches in deep learning and neural networks.

By the end of the book, you'll be able to apply your skills in OpenCV to create computer vision applications in various domains.

Style and approach

This book helps you learn the core concepts of OpenCV faster by taking a recipe-based approach where you can try out different code snippets to understand a concept.

OpenCV 3 Computer Vision with Python Cookbook, by Alexey Spizhevoy and Aleksandr Rybnikov.

Information

Year
2018
ISBN
9781788478755
Edition
1

Object Detection and Machine Learning

In this chapter, we will cover the following recipes:
  • Obtaining an object mask using the GrabCut algorithm
  • Finding edges using the Canny algorithm
  • Detecting lines and circles using the Hough transform
  • Finding objects via template matching
  • The real-time median-flow object tracker
  • Tracking objects using different algorithms via the tracking API
  • Computing the dense optical flow between two frames
  • Detecting chessboard and circle grid patterns
  • A simple pedestrian detector using the SVM model
  • Optical character recognition using different machine learning models
  • Detecting faces using Haar/LBP cascades
  • Detecting ArUco patterns for AR applications
  • Detecting text in natural scenes
  • The QR code detector and recognizer

Introduction

Our world contains a lot of objects. Each type of object has its own features that distinguish it from some types and, at the same time, make it similar to others. Understanding the scene through the objects in it is a key task in computer vision. Finding and tracking various objects, detecting basic patterns and complex structures, and recognizing text are challenging and useful skills, and this chapter shows how to implement and use them with OpenCV functionality.
We will review the detection of geometric primitives, such as lines, circles, and chessboards, and more complex objects, such as pedestrians, faces, and ArUco and QR code patterns. We will also perform object tracking tasks.

Obtaining an object mask using the GrabCut algorithm

There are cases where we want to separate an object from other parts of a scene; in other words, where we want to create masks for the foreground and background. This job is tackled by the GrabCut algorithm. It can build object masks in semi-automatic mode. All that it needs are initial assumptions about object location. Based on these assumptions, the algorithm performs a multi-step iterative procedure to model statistical distributions of foreground and background pixels and find the best division according to the distributions. This sounds complicated, but the usage is very simple. Let's find out how easily we can apply this sophisticated algorithm in OpenCV.

Getting ready

Before you proceed with this recipe, you need to install the OpenCV 3.x Python API package.

How to do it...

  1. Import the modules:
import cv2
import numpy as np
  2. Open an image and define the mouse callback function to draw a rectangle on the image:
img = cv2.imread('../data/Lena.png', cv2.IMREAD_COLOR)
show_img = np.copy(img)

mouse_pressed = False
y = x = w = h = 0

def mouse_callback(event, _x, _y, flags, param):
    global show_img, x, y, w, h, mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        # Start a new rectangle at the click position
        mouse_pressed = True
        x, y = _x, _y
        show_img = np.copy(img)

    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            # Redraw the rectangle while dragging
            show_img = np.copy(img)
            cv2.rectangle(show_img, (x, y),
                          (_x, _y), (0, 255, 0), 3)

    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False
        # Normalize so the rectangle is valid for any drag direction
        x, y, w, h = min(x, _x), min(y, _y), abs(_x - x), abs(_y - y)
  3. Display the image. Once the rectangle has been drawn and the A key has been pressed, close the window:
cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)

while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    # Leave the loop only when a non-empty rectangle exists
    if k == ord('a') and not mouse_pressed:
        if w*h > 0:
            break

cv2.destroyAllWindows()
  4. Call cv2.grabCut to create an object mask based on the rectangle that was drawn. Then, darken the pixels labeled as background and show the result:
labels = np.zeros(img.shape[:2], np.uint8)

labels, bgdModel, fgdModel = cv2.grabCut(img, labels, (x, y, w, h), None, None, 5, cv2.GC_INIT_WITH_RECT)

# Dim every pixel that is definitely or probably background
show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3

cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()
  5. Define the mouse callback to draw the mask on the image. This is needed to correct mistakes made by the previous cv2.grabCut call:
label = cv2.GC_BGD
lbl_clrs = {cv2.GC_BGD: (0,0,0), cv2.GC_FGD: (255,255,255)}

def mouse_callback(event, x, y, flags, param):
    global mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        mouse_pressed = True
        # Paint the current label into the mask and its color into the preview
        cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
        cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)

    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
            cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)

    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False
  6. Show the image with the mask. Draw in white over object pixels that were wrongly labeled as background, and in black over background areas that were wrongly marked as belonging to the object. Press L to switch between the two labels and A to finish. Then, call cv2.grabCut again to get the corrected mask. Finally, update the mask on the image, and show it:
cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)

while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    if k == ord('a') and not mouse_pressed:
        break
    elif k == ord('l'):
        # Toggle between cv2.GC_BGD and cv2.GC_FGD
        label = cv2.GC_FGD - label

cv2.destroyAllWindows()

# Refine the segmentation, starting from the manually corrected labels
labels, bgdModel, fgdModel = cv2.grabCut(img, labels, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)

show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3

cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()
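
Two small pieces of logic from the steps above can be tried without opening a window: turning a mouse drag into an (x, y, w, h) rectangle, and dimming the pixels labeled as background. The following is only a minimal sketch. The helper names rect_from_drag and dim_background are hypothetical and not part of OpenCV, and the label values are written out as plain integers that match cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, and cv2.GC_PR_FGD so that the example needs only NumPy.

```python
import numpy as np

# Integer values of OpenCV's GrabCut label constants
# (cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD).
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def rect_from_drag(x0, y0, x1, y1):
    """Hypothetical helper: convert two drag corners to (x, y, w, h)."""
    return min(x0, x1), min(y0, y1), abs(x1 - x0), abs(y1 - y0)


def dim_background(img, labels, factor=3):
    """Hypothetical helper: return a copy of img with background pixels darkened."""
    out = img.copy()
    background = (labels == GC_BGD) | (labels == GC_PR_BGD)
    out[background] //= factor
    return out


# A drag from bottom-right to top-left still yields a valid rectangle.
rect = rect_from_drag(40, 30, 10, 5)

# Tiny 2x2 "image" with every channel set to 90 and one pixel per label.
img = np.full((2, 2, 3), 90, dtype=np.uint8)
labels = np.array([[GC_BGD, GC_FGD],
                   [GC_PR_BGD, GC_PR_FGD]], dtype=np.uint8)
dimmed = dim_background(img, labels)
```

Working on a copy, as dim_background does, mirrors the recipe's use of np.copy(img): the original image stays untouched and can be passed to cv2.grabCut again.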

How it works...

OpenCV's cv2.grabCut implements the GrabCut algorithm. This function is able to work in several modes, and takes the following arguments: an input 3-channel image, a matrix with initial labels for pixels, a rectangle in (x, y, w, h) format to define label initialization, two matrices to store the process state, the number of iterations, and the mode in which we want the function to launch.
The function returns the labels matrix and two matrices with the state of the process. The labels matrix is single-channel, and it stores one of these values in each pixel: cv2.GC_BGD (the pixel definitely belongs to the background), cv2.GC_PR_BGD (the pixel is probably in the background), cv2.GC_PR_FGD (the pixel is probably in the foreground), or cv2.GC_FGD (the pixel is definitely in the foreground). The two state matrices are necessary if we want to continue the process for a few more iterations.
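
The four label values can be collapsed into a binary foreground mask once the segmentation is finished. The following is a minimal sketch using NumPy only. The small labels array is hypothetical and stands in for the output of cv2.grabCut, and the constants are written as the integers that cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, and cv2.GC_PR_FGD correspond to.

```python
import numpy as np

# Integer values of OpenCV's GrabCut label constants.
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

# Hypothetical 3x4 labels matrix, standing in for cv2.grabCut output.
labels = np.array([
    [GC_BGD, GC_BGD,    GC_PR_BGD, GC_PR_BGD],
    [GC_BGD, GC_PR_FGD, GC_FGD,    GC_PR_BGD],
    [GC_BGD, GC_PR_FGD, GC_PR_FGD, GC_PR_BGD],
], dtype=np.uint8)

# Treat "definitely" and "probably" foreground alike.
foreground = np.isin(labels, (GC_FGD, GC_PR_FGD))

# 0/255 single-channel mask, a common form for image masks.
mask = foreground.astype(np.uint8) * 255
```

Whether probable foreground should count as foreground is a design choice; a stricter mask could keep only the GC_FGD pixels.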
There are three possible modes for the function: cv2.GC...

Table of contents

  1. Title Page
  2. Copyright and Credits
  3. Packt Upsell
  4. Contributors
  5. Preface
  6. I/O and GUI
  7. Matrices, Colors, and Filters
  8. Contours and Segmentation
  9. Object Detection and Machine Learning
  10. Deep Learning
  11. Linear Algebra
  12. Detectors and Descriptors
  13. Image and Video Processing
  14. Multiple View Geometry
  15. Other Books You May Enjoy