Synthetic Vision
eBook - ePub

Synthetic Vision

Using Volume Learning and Visual DNA

Scott Krig

  1. 368 pages
  2. English

About the Book

In Synthetic Vision: Using Volume Learning and Visual DNA, a holistic model of the human visual system is developed into a working model in C++, informed by the latest neuroscience, DNN, and computer vision research. The author's synthetic visual pathway model includes the eye, the LGN, the visual cortex, and the high-level PFC learning centers. The corresponding visual genome model (VGM), begun in 2014, is introduced herein as the basis for a visual genome project analogous to the Human Genome Project funded by the US government. The VGM introduces volume learning principles and Visual DNA (VDNA), taking a multivariate approach beyond deep neural networks. Volume learning is modeled as programmable learning and reasoning agents, providing rich methods for structured agent classification networks. Volume learning incorporates a massive volume of multivariate features in various data space projections, collected into strands of Visual DNA analogous to human DNA genes. The VGM lays a foundation for a visual genome project to sequence VDNA as visual genomes in a public database, using collaborative research to move synthetic vision science forward and enable new applications. Bibliographical references are provided to key neuroscience, computer vision, and deep learning research, which forms the basis for the biologically plausible VGM model and the synthetic visual pathway. The book also includes graphical illustrations and C++ API reference materials to enable VGM application programming. Open source code licenses are available for engineers and scientists.

Scott Krig founded Krig Research to provide some of the world's first vision and imaging systems for military, industrial, government, and academic use. Krig has worked for major corporations and startups in the areas of machine learning, computer vision, imaging, graphics, robotics and automation, computer security, and cryptography. He has authored international patents in the areas of computer architecture, communications, computer security, digital imaging, and computer vision, and studied at Stanford. Scott Krig is the author of the English/Chinese Springer book Computer Vision Metrics: Survey, Taxonomy, and Analysis of Computer Vision, Visual Neuroscience, and Deep Learning, Textbook Edition, as well as other books, articles, and papers.


Information

Publisher
De|G Press
Year
2018
ISBN
9781501506291

Chapter 1
Synthetic Vision Using Volume Learning and Visual DNA

Whence arises all that order and beauty we see in the world?
―Isaac Newton

Overview

Imagine a synthetic vision model with a large photographic memory: one that learns all the separate features in each image it has ever seen, recalls features on demand, searches for similarities and differences, and learns continually. Imagine all the possible applications, which can grow and learn over time, limited only by available storage and compute power. This book describes such a model.
This is a technical and visionary book, not a how-to book. It describes a working synthetic vision model of the human visual system, based on the best neuroscience research combined with artificial intelligence (AI), deep learning, and computer vision methods. References to the literature are cited throughout, allowing the reader to dig deeper into key topics. As befits a technical book, math concepts, equations, and code snippets are included liberally to describe key algorithms and the architecture.
In a nutshell, the synthetic vision model divides each image scene into logical parts, similar to multidimensional puzzle pieces, and each part is described by about 16,000 different Visual DNA metrics (VDNA) within a volume feature space. VDNA are associated into strands, like visual DNA genes, to represent higher-level objects. Everything is stored in the photographic visual memory; nothing is lost. A growing set of learning agents continually retrains on new and old VDNA to increase visual knowledge.
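The photographic-memory idea can be sketched in a few lines of C++. This is an illustrative toy, not the VGM API: the type names, the tiny metric vectors, and the L2-distance recall rule are all assumptions made for this sketch (the real model stores roughly 16,000 metrics per region and richer associations).

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch: each segmented region carries a vector of VDNA
// metrics, and the "photographic memory" is an append-only store that
// never discards an entry.
struct VdnaRecord {
    int regionId;
    std::vector<float> metrics;  // ~16,000 entries in the real model; tiny here
};

class PhotographicMemory {
public:
    void remember(const VdnaRecord& r) { store_.push_back(r); }  // nothing is lost

    // Recall the stored region most similar to a query, by squared L2 distance.
    int recallMostSimilar(const std::vector<float>& query) const {
        int best = -1;
        float bestDist = std::numeric_limits<float>::max();
        for (const VdnaRecord& r : store_) {
            float d = 0.0f;
            for (std::size_t i = 0; i < query.size(); ++i) {
                float diff = r.metrics[i] - query[i];
                d += diff * diff;
            }
            if (d < bestDist) { bestDist = d; best = r.regionId; }
        }
        return best;  // -1 when the memory is empty
    }

private:
    std::vector<VdnaRecord> store_;
};
```

Because the store is append-only, "searching for similarities and differences" reduces to scanning all remembered records; a production system would use an indexed or associative structure instead.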
The current model is a first step, still growing and learning. Test results are included, showing the potential of the synthetic vision model. This book does not make any claims for fitness for a particular purpose. Rather, it points to a future when synthetic vision is a commodity, together with synthetic eyes, ears, and other intelligent life.
You will be challenged to wonder how the human visual system works and perhaps find ways to criticize and improve the model presented in this book. That’s all good, since this book is intended as a starting point—to propose a visual genome project to allow for collaborative research to move the synthetic model forward. The visual genome project proposed herein allows for open source code development and joint research, as well as commercial development spin-offs, similar to the Human Genome Project funded by the US government, which motivates this work. The visual genome project will catalog Visual DNA on a massive scale, encouraging collaborative research and commercial application development for sponsors and partners.
Be prepared to learn new terminology and concepts, since this book breaks new ground in the area of computer vision and visual learning. New terminology is introduced as needed to describe the concepts of the synthetic vision model. Here are some of the key new terms and concepts herein:
Synthetic vision: A complete model of the human visual system; not just image processing, deep learning, or computer vision, but a complete model of the visual pathway and its learning centers (see Figure 1.1). In the 1960s, aerospace and defense companies developed advanced cockpit control systems described as synthetic vision systems, including flight controls, target tracking, fire controls, and flight automation. This work, however, is not directed toward aerospace or defense (though it could be applied there); rather, we focus on modeling the human visual pathway.
Volume learning: No single visual feature is a panacea for all applications, so we use a multivariate, multidimensional feature volume, not just a deep hierarchy of monovariate features as in DNN (deep neural network) gradient weights. The synthetic model currently uses over 16,000 different types of features. For example, a DNN uses monovariate feature weights representing edge gradients, built up using 3×3 or n×n kernels from a training set of images. Other computer vision methods use trained feature descriptors such as the scale-invariant feature transform (SIFT), or basis functions such as Fourier features and Haar wavelets.
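For contrast, the monovariate 3×3 kernel features mentioned above can be shown in a minimal sketch: a single Sobel-style horizontal-gradient kernel applied at one pixel. The function name and the image representation are invented for illustration; a DNN learns many such kernels from training data, while the volume learning approach gathers thousands of distinct metric types per region.

```cpp
#include <array>
#include <vector>

// Apply one 3x3 kernel at pixel (y, x) of a grayscale image. This is the
// elementary operation behind the monovariate edge-gradient weights used
// by DNN layers; it yields a single scalar response per position.
float convolve3x3At(const std::vector<std::vector<float>>& img,
                    int y, int x,
                    const std::array<std::array<float, 3>, 3>& k) {
    float sum = 0.0f;
    for (int dy = -1; dy <= 1; ++dy)
        for (int dx = -1; dx <= 1; ++dx)
            sum += img[y + dy][x + dx] * k[dy + 1][dx + 1];
    return sum;
}
```

A strong response at a pixel signals a vertical edge under this kernel; a flat image yields zero. The point of the contrast is that each such kernel yields one variable, whereas the feature volume spans many metric families at once.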
Visual DNA: We use human DNA strands as the model and inspiration to organize visual features into higher-level objects. Human DNA is composed of four bases: (A) Adenine, (T) Thymine, (G) Guanine, and (C) Cytosine, combined in a single strand, divided into genes containing related DNA bases. Likewise, we represent the volume of multivariate features as strands of visual DNA (VDNA) across several bases, such as (C) Color, (T) Texture, (S) Shape, and (G) Glyphs, including icons, motifs, and other small complex local features.
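The four-base analogy can be made concrete with a small C++ sketch. The type names and the (base, metric id) encoding here are assumptions for illustration, not the book's API:

```cpp
#include <string>
#include <vector>

// The four VDNA bases named in the text: (C) Color, (T) Texture,
// (S) Shape, (G) Glyphs.
enum class VdnaBase { Color, Texture, Shape, Glyph };

struct VdnaElement {
    VdnaBase base;
    int metricId;  // index of one metric within the volume feature space
};

// A strand is an ordered sequence of VDNA elements, analogous to a gene.
using VdnaStrand = std::vector<VdnaElement>;

// Write a strand as its base letters (e.g. "CTS"), the way a DNA gene
// is written as a string of A/T/G/C.
std::string baseString(const VdnaStrand& strand) {
    std::string s;
    for (const VdnaElement& e : strand) {
        switch (e.base) {
            case VdnaBase::Color:   s += 'C'; break;
            case VdnaBase::Texture: s += 'T'; break;
            case VdnaBase::Shape:   s += 'S'; break;
            case VdnaBase::Glyph:   s += 'G'; break;
        }
    }
    return s;
}
```

Higher-level objects would then be represented by strands (and bundles of strands) drawing on all four bases, rather than by any single feature family.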
Many other concepts and terminology are introduced throughout this work, as we break new ground and push the boundaries of visual system modeling. So enjoy the journey through this book, take time to wonder how the human visual system works, and hopefully add value to your expertise along the way.
Figure 1.1: The synthetic visual pathway model. The model is composed of (1) an eye/LGN model for optical and early vision processing, (2) a memory model containing groups of related features emulating neural clusters of VDNA found in the visual cortex processing centers V1-V4-Vn, and (3) a learning and reasoning model using agents to perform high-level visual reasoning. Agents create top-level classifiers using a set of multivariate MCC classifiers (discussed in Chapters 4–6).

Synthetic Visual Pathway Model

The synthetic vision model, shown in Figure 1.1, includes a biologically plausible Eye/LGN Model for early vision processing and image assembly, discussed in Chapter 2; a Memory Model that includes several regional processing centers with local memory for specific types of features and objects, discussed in Chapter 3; and a Learning/Reasoning Model composed of agents that learn and reason, discussed in Chapter 4. Synthetic Neural Clusters, discussed in Chapter 6, represent groups of low-level edges describing objects in multiple color spaces and metric spaces, following the standard Hubel and Wiesel theories [1] that inspired DNNs. The neural clusters allow low-level synthetic neural concepts to be compared. See Figures 1.2, 1.3, and 1.4 as we go.
In this chapter, we provide several overview sections describing the current synthetic visual pathway model, with model section details following in Chapters 2–11. The resulting volume of features and visual learning agents residing within the synthetic model is referred to as the visual genome model (VGM). Furthermore, we propose and discuss a visual genome project enabled by the VGM to form a common basis for collaborative research to move synthetic vision science forward and enable new applications, as well as to spin off new commercial products, discussed in Chapter 12.
We note that the visual genome model and corresponding project described herein are unrelated to the work of Krishna et al. [163][164], who use crowd-sourced volunteers to create a large multilabeled training set of annotated image features, which they call a visual genome.
Some computer vision practitioners may compare the VGM discussed in this book to earlier research such as parts models or bag-of-features models [1]; however, the VGM is, of necessity, much richer in order to emulate the visual pathway. The fundamental concepts of the VGM are based on the background research in the author's prior book, Computer Vision Metrics: Survey, Taxonomy, and Analysis of Computer Vision, Visual Neuroscience, and Deep Learning, Textbook Edition (Springer-Verlag, 2016), which includes nearly 900 references to the literature. The reader is urged to have a copy on hand.

Visual Genome Model

The VGM is shown in Figures 1.2 and 1.4; it consists of a hierarchy of multivariate feature types (Magno, Parvo, Strands, Bundles) and is much more holistic than existing feature models in the literature (see the survey in [1]). The microlevel features are referred to as visual DNA, or VDNA. Each VDNA is a feature metric, described in Chapters 4–10. At the bottom level is the eye and LGN model, described in Chapters 2 and 3. At the top level is the learned intelligence: the agents, as discussed in Chapter 4. Each agent is a proxy for a specific type of learned intelligence, and the number of agents is not limited, unlike most computer vision systems, which rely on a single trained classifier. Agents evaluate the visual features during training, learning, and reasoning. Agents can cooperate and learn continually, as described in Chapter 4.
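The agent concept might be sketched as follows. The interface, the toy threshold rule, and the "first confident answer wins" policy are all assumptions made for this sketch, not the VGM's actual agent design; they only illustrate how any number of agents can coexist instead of a single monolithic classifier.

```cpp
#include <memory>
#include <string>
#include <vector>

// Each agent classifies from its own learned view of the features.
struct Agent {
    virtual ~Agent() = default;
    virtual std::string classify(const std::vector<float>& features) const = 0;
};

// Toy learned rule: fire a label when one feature exceeds a threshold.
struct ThresholdAgent : Agent {
    int index;
    float threshold;
    std::string label;
    ThresholdAgent(int i, float t, std::string l)
        : index(i), threshold(t), label(std::move(l)) {}
    std::string classify(const std::vector<float>& f) const override {
        return f[index] > threshold ? label : "unknown";
    }
};

// One simple cooperation policy: consult agents in order and take the
// first confident answer. More agents can be added at any time.
std::string consult(const std::vector<std::unique_ptr<Agent>>& agents,
                    const std::vector<float>& features) {
    for (const auto& a : agents) {
        std::string r = a->classify(features);
        if (r != "unknown") return r;
    }
    return "unknown";
}
```

The design point is that the agent population is open-ended: adding a new learned capability means adding an agent, not retraining one global classifier.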
Visual genomes are composed from the lower-level VDNA features into sequences (i.e., VDNA strands and bundles of strands), similar to a visual DNA chain, to represent higher-level concepts. Note that, as shown in Figure 1.2, some of the features (discussed in Chapter 6) are stored in content-addressable memory (CAM) as neural clusters residing in an associative memory space, allowing for a wide range of feature associations. The feature model, including the LGN magno and parvo features, is discussed in detail in Chapters 2, 3, and 4.
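Content-addressable lookup can be illustrated with a minimal sketch: a feature is retrieved by a signature computed from its content, rather than by a storage address. The binarizing signature below is invented for this example and is far cruder than the neural-cluster associations the book describes; it only shows the addressing-by-content idea.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Coarsely binarize a feature vector into a content signature, so that
// similar content maps to the same key.
std::string signature(const std::vector<float>& feature) {
    std::string s;
    for (float v : feature) s += (v > 0.5f ? '1' : '0');
    return s;
}

// A toy CAM: keys are content signatures, values are neural-cluster ids.
class AssociativeStore {
public:
    void insert(const std::vector<float>& feature, int clusterId) {
        cam_[signature(feature)] = clusterId;
    }
    // Return the cluster associated with the query's content pattern,
    // or -1 when no association exists.
    int lookup(const std::vector<float>& feature) const {
        auto it = cam_.find(signature(feature));
        return it == cam_.end() ? -1 : it->second;
    }
private:
    std::unordered_map<std::string, int> cam_;
};
```

Two features with the same coarse pattern retrieve the same cluster even though their exact values differ, which is the essence of associative recall.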
Figure 1.2: The hierarchical visual genome model (VGM). Illustration Copyright © Springer International Publishing 2016. Used by permission (see [166]).
The VGM follows the neurobiolo...

Table of Contents

  1. Cover
  2. Title Page
  3. Copyright
  4. Contents
  5. Chapter 1: Synthetic Vision Using Volume Learning and Visual DNA
  6. Chapter 2: Eye/LGN Model
  7. Chapter 3: Memory Model and Visual Cortex
  8. Chapter 4: Learning and Reasoning Agents
  9. Chapter 5: VGM Platform Overview
  10. Chapter 6: Volume Projection Metrics
  11. Chapter 7: Color 2D Region Metrics
  12. Chapter 8: Shape Metrics
  13. Chapter 9: Texture Metrics
  14. Chapter 10: Region Glyph Metrics
  15. Chapter 11: Applications, Training, Results
  16. Chapter 12: Visual Genome Project
  17. Bibliography
  18. Index