- 306 pages
- English
OpenCV 3 Computer Vision with Python Cookbook
About This Book
Recipe-based approach to tackle the most common problems in Computer Vision by leveraging the functionality of OpenCV using Python APIs
- Build computer vision applications with OpenCV functionality via Python API
- Get to grips with image processing, multiple view geometry, and machine learning
- Learn to use deep learning models for image classification, object detection, and face recognition
Who This Book Is For
This book is for developers who have a basic knowledge of Python. If you are aware of the basics of OpenCV and are ready to build computer vision systems that are smarter, faster, more complex, and more practical than the competition, then this book is for you.
What You Will Learn
- Get familiar with low-level image processing methods
- See the common linear algebra tools needed in computer vision
- Work with different camera models and epipolar geometry
- Find out how to detect interesting points in images and compare them
- Binarize images and mask out regions of interest
- Detect objects and track them in videos
In Detail
OpenCV 3 is a native cross-platform library for computer vision, machine learning, and image processing. OpenCV's convenient high-level APIs hide very powerful internals designed for computational efficiency that can take advantage of multicore and GPU processing. This book will help you tackle increasingly challenging computer vision problems by providing a number of recipes that you can use to improve your applications.
In this book, you will learn how to process an image by manipulating pixels and analyze an image using histograms. Then, we'll show you how to apply image filters to enhance image content and exploit the image geometry in order to relay different views of a pictured scene. We'll explore techniques to achieve camera calibration and perform a multiple-view analysis.
Later, you'll work on reconstructing a 3D scene from images, converting low-level pixel information to high-level concepts for applications such as object detection and recognition. You'll also discover how to process video from files or cameras and how to detect and track moving objects. Finally, you'll get acquainted with recent approaches in deep learning and neural networks.
By the end of the book, you'll be able to apply your skills in OpenCV to create computer vision applications in various domains.
Style and approach
This book helps you learn the core concepts of OpenCV faster by taking a recipe-based approach where you can try out different code snippets to understand a concept.
Object Detection and Machine Learning
- Obtaining an object mask using the GrabCut algorithm
- Finding edges using the Canny algorithm
- Detecting lines and circles using the Hough transform
- Finding objects via template matching
- The real-time median-flow object tracker
- Tracking objects using different algorithms via the tracking API
- Computing the dense optical flow between two frames
- Detecting chessboard and circle grid patterns
- A simple pedestrian detector using the SVM model
- Optical character recognition using different machine learning models
- Detecting faces using Haar/LBP cascades
- Detecting ArUco patterns for AR applications
- Detecting text in natural scenes
- The QR code detector and recognizer
Introduction
Obtaining an object mask using the GrabCut algorithm
Getting ready
How to do it...
- Import the modules:
import cv2
import numpy as np
- Open an image and define the mouse callback function to draw a rectangle on the image:
img = cv2.imread('../data/Lena.png', cv2.IMREAD_COLOR)
show_img = np.copy(img)
mouse_pressed = False
y = x = w = h = 0
def mouse_callback(event, _x, _y, flags, param):
    global show_img, x, y, w, h, mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        mouse_pressed = True
        x, y = _x, _y
        show_img = np.copy(img)
    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            show_img = np.copy(img)
            cv2.rectangle(show_img, (x, y),
                          (_x, _y), (0, 255, 0), 3)
    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False
        # Normalize so that (x, y) is the top-left corner and w, h are
        # non-negative, whichever direction the rectangle was dragged in
        x, y, w, h = min(x, _x), min(y, _y), abs(_x - x), abs(_y - y)
- Display the image. Once the rectangle has been drawn, press the A key on the keyboard to close the window and continue:
cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)
while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    if k == ord('a') and not mouse_pressed:
        if w*h > 0:
            break
cv2.destroyAllWindows()
- Call cv2.grabCut to create an object mask from the rectangle that was drawn. Then, show the result with the background pixels darkened:
labels = np.zeros(img.shape[:2],np.uint8)
labels, bgdModel, fgdModel = cv2.grabCut(img, labels, (x, y, w, h), None, None, 5, cv2.GC_INIT_WITH_RECT)
show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3
cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()
- Define a mouse callback that draws on the mask. This makes it possible to correct mistakes in the previous cv2.grabCut call:
label = cv2.GC_BGD
lbl_clrs = {cv2.GC_BGD: (0,0,0), cv2.GC_FGD: (255,255,255)}
def mouse_callback(event, x, y, flags, param):
    global mouse_pressed

    if event == cv2.EVENT_LBUTTONDOWN:
        mouse_pressed = True
        cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
        cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)
    elif event == cv2.EVENT_MOUSEMOVE:
        if mouse_pressed:
            cv2.circle(labels, (x, y), 5, label, cv2.FILLED)
            cv2.circle(show_img, (x, y), 5, lbl_clrs[label], cv2.FILLED)
    elif event == cv2.EVENT_LBUTTONUP:
        mouse_pressed = False
- Show the image with the mask. Draw in white over object pixels that have been wrongly labeled as background, and in black over background areas that have been wrongly marked as belonging to the object. Press the L key to switch between the two labels (drawing starts with the background label) and the A key to finish. Then, call cv2.grabCut again to get the corrected mask. Finally, apply the updated mask to the image and show it:
cv2.namedWindow('image')
cv2.setMouseCallback('image', mouse_callback)
while True:
    cv2.imshow('image', show_img)
    k = cv2.waitKey(1)

    if k == ord('a') and not mouse_pressed:
        break
    elif k == ord('l'):
        label = cv2.GC_FGD - label
cv2.destroyAllWindows()
labels, bgdModel, fgdModel = cv2.grabCut(img, labels, None, bgdModel, fgdModel, 5, cv2.GC_INIT_WITH_MASK)
show_img = np.copy(img)
show_img[(labels == cv2.GC_PR_BGD)|(labels == cv2.GC_BGD)] //= 3
cv2.imshow('image', show_img)
cv2.waitKey()
cv2.destroyAllWindows()
How it works...
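The recipe's code relies on the four label values that cv2.grabCut stores in the mask. The sketch below illustrates, with NumPy only, how the L key switches the drawing label and how background pixels are darkened for display. The constants are hard-coded to the values OpenCV assigns so that the sketch runs without cv2, and the helper names toggle and dim_background are hypothetical, introduced here for illustration.

```python
import numpy as np

# Label values as defined by OpenCV (cv2.GC_BGD, cv2.GC_FGD,
# cv2.GC_PR_BGD, cv2.GC_PR_FGD); hard-coded here as an assumption
# so that the sketch does not need cv2
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3

def toggle(label):
    # Hypothetical helper: same arithmetic as `label = cv2.GC_FGD - label`,
    # which maps 0 to 1 and 1 to 0
    return GC_FGD - label

def dim_background(img, labels):
    # Hypothetical helper: darken definite and probable background
    # pixels to a third of their value, leaving the input untouched
    out = np.copy(img)
    out[(labels == GC_PR_BGD) | (labels == GC_BGD)] //= 3
    return out

# Tiny 2x2 image with one pixel per label value
img = np.full((2, 2, 3), 90, np.uint8)
labels = np.array([[GC_BGD, GC_FGD],
                   [GC_PR_BGD, GC_PR_FGD]], np.uint8)
shown = dim_background(img, labels)
```

In this example the left column of shown (background labels) is darkened, while the right column (foreground labels) keeps its original value.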
Table of contents
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributors
- Preface
- I/O and GUI
- Matrices, Colors, and Filters
- Contours and Segmentation
- Object Detection and Machine Learning
- Deep Learning
- Linear Algebra
- Detectors and Descriptors
- Image and Video Processing
- Multiple View Geometry
- Other Books You May Enjoy