Chapter 1
Practical Aspects of a Vision System: Image Display, Input/Output, and Library Calls
When experimenting with vision- and image-analysis systems or implementing one for a practical purpose, a basic software infrastructure is essential. Images consist of pixels, and in a typical image from a digital camera there will be 4 to 6 million pixels, each representing the color at a point in the image. This large amount of data is stored as a file in a format (such as GIF or JPEG) suitable for manipulation by commercial software packages, such as Photoshop and Paint. Developing new image-analysis software means first being able to read these files into an internal form that allows access to the pixel values. There is nothing exciting about code that does this, and it does not involve any actual image processing, but it is an essential first step. Similarly, image-analysis software will need to display images on the screen and save them in standard formats. It's probably useful to have a facility for image capture available, too. None of these operations modifies an image; they simply move it about in useful ways.
These bookkeeping tasks can require most of the code involved in an imaging program. The procedure for changing all red pixels to yellow, for example, can contain as few as 10 lines of code; yet the program needed to read the image, display it, and output the result may require an additional 2,000 lines of code, or even more.
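To make the contrast concrete, the following is a minimal sketch of the "red to yellow" procedure, written in plain C against a raw pixel buffer rather than any particular library. It assumes 8-bit pixels stored in interleaved blue-green-red order with a fixed number of bytes per row; the function names, the buffer layout, and the thresholds that decide what counts as "red" are illustrative assumptions, not something taken from the text.

```c
#include <stddef.h>

/* Hypothetical test for "red": a strong red channel with weak green and
   blue. The threshold values are assumptions chosen for illustration. */
static int is_red(unsigned char b, unsigned char g, unsigned char r)
{
    return r > 200 && g < 60 && b < 60;
}

/* Recolor every red pixel to yellow in an interleaved 8-bit BGR buffer.
   row_bytes is the number of bytes per row, which may include padding.
   Returns the number of pixels that were changed. */
int red_to_yellow(unsigned char *data, int width, int height, size_t row_bytes)
{
    int x, y, changed = 0;

    for (y = 0; y < height; y++) {
        unsigned char *row = data + (size_t)y * row_bytes;
        for (x = 0; x < width; x++) {
            unsigned char *p = row + 3 * x;   /* p[0]=B, p[1]=G, p[2]=R */
            if (is_red(p[0], p[1], p[2])) {
                p[0] = 0;                     /* yellow = red + green */
                p[1] = 255;
                p[2] = 255;
                changed++;
            }
        }
    }
    return changed;
}
```

The pixel-processing part is indeed only a handful of lines. Everything this sketch leaves out (decoding the file into the buffer, showing it on the screen, writing it back out) is the infrastructure code the paragraph above refers to.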
Of course, this infrastructure code (which can be thought of as an application programming interface, or API) can be used for all applications; so, once it is developed, the API can be used without change until updates are required. Changes in the operating system, in underlying libraries, or in additional functionalities can require new versions of the API. If properly done, these new versions will require little or no modification to the vision programs that depend on it. Such an API is the OpenCV system.
1.1 OpenCV
OpenCV was originally developed by Intel. At the time of this writing, version 2.0 is current and can be downloaded from http://sourceforge.net/projects/opencvlibrary/.
However, Version 2.0 is relatively new and does not yet install and compile with all of the major systems and compilers. All the examples in this book use Version 1.1 from http://sourceforge.net/projects/opencvlibrary/files/opencv-win/1.1pre1/OpenCV_1.1pre1a.exe/download, and compile with the Microsoft Visual C++ 2008 Express Edition, which can be downloaded from www.microsoft.com/express/Downloads/#2008-Visual-CPP.
The Algorithms for Image Processing and Computer Vision website (www.wiley.com/go/jrparker) will maintain current links to new versions of these tools. The website shows how to install both the compiler and OpenCV. The advantage of using this combination of tools is that they are still pretty current, they work, and they are free.
1.2 The Basic OpenCV Code
OpenCV is a library of C functions that implement both infrastructure operations and image-processing and vision functions. Developers can, of course, add their own functions into the mix. Thus, any of the code described here can be invoked from a program that uses the OpenCV paradigm, meaning that the methods of this book are available in addition to those of OpenCV. One simply needs to know how to call the library, and what the basic data structures of OpenCV are.
OpenCV is a large and complex library. To assist everyone in starting to use it, the following is a basic program that can be modified to do almost anything that anyone would want:
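// basic.c : A "wrapper" for basic vision programs.
#include "stdafx.h"
#include <stdio.h>
#include "cv.h"
#include "highgui.h"
int main (int argc, char* argv[])
{
    IplImage *image = 0;
    // Load the file as a color image (second argument 1).
    // Backslashes in a C string literal must be doubled.
    image = cvLoadImage("C:\\AIPCV\\image1.jpg", 1 );
    if( image )
    {
        cvNamedWindow( "Input Image", 1 );
        cvShowImage( "Input Image", image );
        printf( "Press a key to exit\n");
        cvWaitKey(0);
        // Destroy the window using the same name it was created with.
        cvDestroyWindow("Input Image");
        cvReleaseImage( &image );
    }
    else
        fprintf( stderr, "Error reading image\n" );
    return 0;
}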
This is similar to many example programs on the Internet. It reads in an image (C:\AIPCV\image1.jpg is a string giving the path name of the image) and displays it in a window on the screen. When the user presses a key, the program terminates after destroying the display window.
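One practical detail about the path string is worth a short illustration. In C source code a single backslash inside a string literal begins an escape sequence, so a Windows path such as C:\AIPCV\image1.jpg has to be written with doubled backslashes. Forward slashes are a common alternative, since Windows file functions generally accept them. The sketch below shows both spellings; the helper to_forward_slashes is a hypothetical name introduced only for this example and is not part of OpenCV.

```c
#include <stddef.h>
#include <string.h>

/* Two ways to spell the same Windows path in C source. */
static const char *PATH_ESCAPED = "C:\\AIPCV\\image1.jpg";
static const char *PATH_SLASHES = "C:/AIPCV/image1.jpg";

/* Hypothetical helper: copy src into dst, replacing each backslash with
   a forward slash. Returns 0 on success, or -1 if dst is too small to
   hold the result and its terminating null character. */
int to_forward_slashes(const char *src, char *dst, size_t dst_size)
{
    size_t i, n = strlen(src);

    if (n + 1 > dst_size)
        return -1;
    for (i = 0; i < n; i++)
        dst[i] = (src[i] == '\\') ? '/' : src[i];
    dst[n] = '\0';
    return 0;
}
```

Either spelling can be passed as the file name; the choice is a matter of taste, as long as a lone backslash never appears inside the literal.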
Before anyone can modify this code in a knowledgeable way, the data structures and functions need to be explained.
1.2.1 The IplImage Data Structure
The IplImage structure is the in-memory data organization for an image. Images in IplImage form can be converted into arrays of pixels, but IplImage also contains a lot of structural information about the image data, which can have many forms. For example, an image read from a GIF file could be 256 gre...