Image is a widely used concept. People generally regard an image as a visual representation of a scene. For example, a dictionary definition of image is “the expression, representation, and imitation of an object, a vivid visual description, something introduced to express other things” (Bow 2002). Strictly speaking, an image is an entity that is obtained by observing the objective world in different forms and by different means with various observation systems, and that can act directly or indirectly on the human eye to produce visual perception (Zhang 1996). The human visual system is itself an observation system, and the image obtained through it is the one that the objective scene forms in the human mind.
Images contain a wealth of information. We live in an information age. Scientific research and statistics show that about 75% of the information that humans obtain from the outside world comes through the visual system, that is, from images. The concept of image here is relatively broad, including photos, drawings, animations, videos, and even documents. There is an old saying in China, “Seeing once is better than hearing a hundred times.” People also often say, “A picture is worth a thousand words.” Both sayings show that the information contained in images is very rich and that images are our main source of information.
This book mainly discusses images obtained by imaging natural scenes, which themselves fall into many categories. Examples include photos (people, landscapes, etc.) taken with digital cameras, videos (family parties, football games, etc.) captured with digital video cameras, various sequences (traffic management, missile flight, etc.) recorded by surveillance systems, various electromagnetic radiation images captured by space telescopes, images formed by radar from reflected waves, and the X-ray images, B-ultrasound images, CT images, and magnetic resonance images (MRI) that are commonly used in medicine. There are not only grayscale and color images but also texture and depth images.
In recent years, images have been widely used in many fields of social development and human life, such as industrial production, smart agriculture, biomedicine and health, leisure and entertainment, video communication, network communication, document management, remote sensing mapping, environmental protection, intelligent transportation, military and public security, space exploration, and so on.
Because images are applied differently in different fields, a wide range of image technologies has been studied. This book selects some basic categories of the most commonly used image technologies and introduces them progressively, from the elementary to the advanced (including basic principles, practical technologies, and development trends).
The contents of each section of this chapter are arranged as follows.
Section 1.1 gives a general introduction to the basic knowledge of images. It covers the representation and display of images and pixels, the relationship between image quality and spatial resolution and/or amplitude resolution, and the half-tone and dithering technologies commonly used in image printout.
Section 1.2 provides an overview of image technology. The overall framework, image engineering (IE), and its three levels, image processing (IP), image analysis (IA), and image understanding (IU), are introduced first. Then, the block diagram of an image system and some of its modules are discussed. This section allows the following chapters to focus on IP, which is the main subject of this book.
Section 1.3 discusses the features of this book. It elaborates and analyzes the three aspects of writing motivation, material selection and contents, as well as structure and arrangement. It not only gives the overall content and structural characteristics of the book but also helps the readers to know how to learn and use this book.
1.1 Image Basics
First, some basic concepts and terminology related to images are reviewed.
1.1.1 Image Representation and Display
Let’s first introduce how to represent and display images.
1.1.1.1 Images and Pixels
The objective world is three-dimensional (3-D) in space, but the image obtained from the objective scene is generally two-dimensional (2-D). An image can be represented by a 2-D array f(x, y), where x and y represent the position of a coordinate point in the 2-D space XY, and f represents the image value of a property F at the point (x, y). For example, f in a grayscale image represents a gray value, which often corresponds to the observed brightness of an objective scene. Text images are often binary images, in which f takes only two values, corresponding to text and blank space, respectively. The image at the point (x, y) can also have multiple properties at the same time. In this case, it can be represented by a vector f. For example, a color image has three values of red, green, and blue at each image point, which can be recorded as [fr(x, y), fg(x, y), fb(x, y)]. Note that images are always used on the basis of the different property values found at different positions in the image.
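The three cases above (grayscale, binary, and color) can be illustrated with a minimal Python sketch. The nested-list layout, the helper name f, the 0–255 value range, and the threshold of 100 are assumptions made for illustration only; they are not notation prescribed by this book.

```python
# A minimal sketch, assuming an image is stored as a list of rows
# and that gray values lie in 0..255 (both are illustrative choices).

# Grayscale image: each point holds one gray value.
gray = [
    [0, 64, 128],
    [64, 128, 255],
]

def f(image, x, y):
    """Return the image value at point (x, y).

    Assumption: x indexes the column and y indexes the row.
    """
    return image[y][x]

# Binary image (e.g., a text image): only two values, 0 and 1.
THRESHOLD = 100  # hypothetical threshold, chosen only for this example
binary = [[1 if value >= THRESHOLD else 0 for value in row] for row in gray]

# Color image: each point holds a vector [fr, fg, fb].
color = [
    [(255, 0, 0), (0, 255, 0), (0, 0, 255)],
    [(0, 0, 0), (128, 128, 128), (255, 255, 255)],
]

gray_value = f(gray, 2, 1)    # a single gray value
fr, fg, fb = f(color, 1, 0)   # three property values at one point
```

In each case the position (x, y) selects a point and the stored value (a scalar or a vector) is the property at that point.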
An image can represent the spatial distribution of radiant energy. This distribution can be a function of five variables T(x, y, z, t, λ), where x, y, and z are the spatial variables, t is the time variable, and λ is the wavelength (corresponding to the spectral variable). For example, a red object reflects light with a wavelength of 0.57–0.78 μm and absorbs almost all energy at other wavelengths; a green object reflects light with a wavelength of 0.48–0.57 μm; a blue object reflects light with a wavelength of 0.40–0.48 μm. Ultraviolet (color) objects reflect light with a wavelength of 0.25–0.40 μm, and infrared (color) objects reflect light with a wavelength of 0.78–1.5 μm. Together, they cover a wavelength range of 0.25–1.5 μm. Since an actual image is finite in time and space, T(x, y, z, t, λ) is a 5-D finite function.
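The wavelength ranges listed above can be collected into a small lookup, sketched here in Python. The table name BANDS, the function name band_of, and the treatment of a shared boundary value (assigned to the longer-wavelength band) are assumptions for illustration; the text gives only the ranges themselves.

```python
# A minimal sketch: wavelength bands in micrometers, taken from the text.
BANDS = [
    (0.25, 0.40, "ultraviolet"),
    (0.40, 0.48, "blue"),
    (0.48, 0.57, "green"),
    (0.57, 0.78, "red"),
    (0.78, 1.50, "infrared"),
]

def band_of(wavelength_um):
    """Return the band name for a wavelength in micrometers, or None.

    Assumption: each band is half-open [low, high), so a boundary value
    belongs to the longer-wavelength band; the upper limit 1.5 is
    included in the last band.
    """
    for low, high, name in BANDS:
        if low <= wavelength_um < high:
            return name
    if wavelength_um == BANDS[-1][1]:
        return BANDS[-1][2]
    return None
```

For instance, band_of(0.45) falls in the blue range, while a wavelength outside 0.25–1.5 μm returns None.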
Images acquired in the early years were mostly continuous (analog); that is, the values of f, x, and y could be any real numbers. With the invention of the computer and the development of electronic equipment, acquired images are now discrete (digital) and can be processed directly by a computer. The notation I(r, c) has been used to represent a digital image, where the values of I, r, and c are all integers. Here I represents the discretized f, and (r, c) represents the discretized (x, y), where r is the image row and c is the image column. The discussion in this book concerns digital images. Where no confusion arises, the term image and the notation f(x, y) are used to denote digital images. Unless otherwise specified, f, x, and y all take their values in the integer set.
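The step from a continuous f(x, y) to a digital I(r, c) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function continuous_image is a hypothetical test pattern, sampling is done at the centers of a regular grid over the unit square, and quantization is uniform; none of these choices comes from the text.

```python
import math

def continuous_image(x, y):
    """Hypothetical continuous image with values in [0.0, 1.0]
    for real-valued x and y (illustrative test pattern only)."""
    return 0.5 * (1.0 + math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y))

def digitize(f, rows, cols, levels):
    """Sample f on a rows x cols grid and quantize to integer levels.

    Returns I as a list of rows, so that I[r][c] is an integer in
    0..levels-1. Assumes f maps the unit square to [0.0, 1.0].
    """
    image = []
    for r in range(rows):
        y = (r + 0.5) / rows          # sample at the center of each cell
        row = []
        for c in range(cols):
            x = (c + 0.5) / cols
            value = f(x, y)
            # Uniform quantization, clamped to the valid integer range.
            q = max(0, min(int(value * levels), levels - 1))
            row.append(q)
        image.append(row)
    return image

I = digitize(continuous_image, rows=4, cols=4, levels=256)
```

After this step, the row index r, the column index c, and the stored value I[r][c] are all integers, matching the description of a digital image above.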
In the early days, the term “picture” was generally used to refer to images. With the development of digital technology, the term “image” is now used to represent a discretized picture because “computers store numerical images of a picture or scene” (Zhang 1996). Each basic unit in an image is called an image element; in the early days, when “picture” was used to refer to an image, it was called a picture element, abbreviated as pixel. For 2-D images, “pel” has also been used to refer to the basic unit. If one collects a ...