GRT is an extremely general model of perception and decision making that has been applied to a wide variety of tasks and behaviors. Nevertheless, it makes several core assumptions that must hold in every application. In particular, it assumes two separate stages of processing: a sensory/perceptual stage that generally precedes a decision stage. It also assumes that all sensory representations are inherently noisy, that every behavior, no matter how trivial, requires a decision, and that decision processes can be modeled via a decision bound. All but the last of these assumptions also define univariate signal detection theory.
When GRT was first proposed nearly 30 years ago, the only one of these assumptions with any independent support was that all sensory representations are noisy. The other assumptions were justified almost exclusively on the basis of intuitive appeal. During the intervening three decades, however, an explosion of new neuroscience knowledge has provided strong tests of all GRT assumptions. This section reviews these assumptions and the relevant neuroscience data. As we will see, for the most part, neuroscience has solidified the foundations of GRT.
1.2.1 Separate Sensory and Decision Processes
GRT and signal detection theory both assume that decision processes act on a percept that may depend on the nature of the task, but does not depend on the actual response that is made. GRT can be applied to virtually any task. However, the most common applications are to tasks where the stimuli vary on two stimulus components or dimensions. Call these A and B. Then a common practice is to let AiBj denote the stimulus in which component A is at level i and component B is at level j. GRT models the sensory or perceptual effects of stimulus AiBj via the joint probability density function (pdf) fij(x1,x2). On any particular trial when stimulus AiBj is presented, GRT assumes that the subject's percept can be modeled as a random sample from this joint pdf and that the subject uses these values to select a response. Thus the pdf fij(x1,x2) is the same on every trial on which stimulus AiBj is presented, regardless of which response is made. In other words, GRT assumes that a sensory representation is formed first, and decision processes then use this representation to select a response. A possible neural implementation of the theory would require that the neural networks used to represent the perceptual properties of stimuli are relatively separated from the networks used to set criteria for decision making (which determine response biases).
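The two-stage structure described above can be sketched in a few lines of code. The sketch below is illustrative only: it assumes the joint pdf fij is a pair of independent Gaussians (i.e., perceptual independence holds) and that the decision bounds are two criteria parallel to the coordinate axes; the function name and parameter values are not from any published GRT application.

```python
import random

def grt_trial(mu, sigma=(1.0, 1.0), criteria=(0.0, 0.0)):
    """Simulate one GRT trial for a stimulus AiBj.

    Stage 1 (sensory/perceptual): the percept (x1, x2) is a random
    sample from the stimulus's joint pdf f_ij -- here simplified to
    two independent Gaussians with means `mu` and standard
    deviations `sigma`.

    Stage 2 (decision): the response is chosen by comparing each
    perceptual value to a fixed criterion, i.e., two decision bounds
    parallel to the coordinate axes.  Note that f_ij does not depend
    on the response: the same pdf governs every presentation of AiBj.
    """
    x1 = random.gauss(mu[0], sigma[0])
    x2 = random.gauss(mu[1], sigma[1])
    level_a = 2 if x1 > criteria[0] else 1
    level_b = 2 if x2 > criteria[1] else 1
    return (x1, x2), f"A{level_a}B{level_b}"

# Example: a stimulus with component A at its high level and
# component B at its low level (mean values are arbitrary).
percept, response = grt_trial(mu=(1.5, -1.5))
```

Because the percept is noisy, the same stimulus can produce different responses across trials; the farther the perceptual means sit from the decision criteria, the more consistent the responses become.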
Even 30 years ago, it was known that the flow of information from sensory receptors up through the brain passes through sensory cortical regions before reaching motor areas that initiate behaviors. In the case of vision, for example, it was known that retinal ganglion cells project from the retina to the lateral geniculate nucleus of the thalamus, which projects to V1 (primary visual cortex), then to V2, V4, and many other regions before neurons in M1 (primary motor cortex) are stimulated, causing the subject to press one response key or the other. So the general neuroanatomy seemed consistent with the GRT assumption of separate sensory and decision processes. Even so, virtually nothing was known about whether visual cortex plays a significant role in the decision process. For example, in 1986, knowledge of neuroanatomy was also consistent with a theory in which the response was actually selected as the representation moved up through higher levels of visual cortex, and the main role of later non-visual areas (e.g., premotor cortex and M1) in many psychophysical tasks was simply to serve as a relay between visual areas and the effectors that would execute the selected behavior. This type of intermingling would violate the GRT assumption that sensory/perceptual and decision processes are relatively separate.
As described below, there is now good evidence that decisions are not mediated within visual cortex. For a while, however, evidence against the GRT assumption of separate sensory and decision processes seemed strong. The most damning evidence came from reports of a variety of category-specific agnosias that result from lesions in inferotemporal cortex (IT) and other high-level visual areas. Category-specific agnosia refers to the ability to perceive or categorize most visual stimuli normally, coupled with a reduced ability to recognize exemplars from some specific category, such as inanimate objects (e.g., tools or fruits). The most widely known of such deficits, which occurs with human faces (i.e., prosopagnosia), is associated with lesions to the fusiform gyrus in IT. In GRT, a category is defined by a response region, not by a perceptual distribution. So the association of category-specific agnosias with lesions in visual cortex seemed to suggest that the visual areas were also learning the decision bounds that defined the categories.
Of course, a category-specific agnosia that results from an IT lesion does not logically imply that category representations are stored in IT. For example, although such agnosias are consistent with the hypothesis that category learning occurs in IT, they are also generally consistent with the hypothesis that visually similar objects are represented in nearby areas of visual cortex. In particular, it is well known that neighboring neurons in IT tend to fire to similar stimuli.
Take the example of the most anterior IT region in the monkey brain: area TE. This is the final stage of purely visual processing in the primate brain; thus if high-level categorical representations were stored in visual cortex, TE would be a likely place for their storage. Research indicates that most neurons in this area are maximally activated by moderately complex shapes or object parts (for reviews, see Tanaka, 1996, 2004). More specifically, they are maximally activated by features that are more complex than simple edges or textures, but not complex enough to represent a whole natural object or object category. Because neurons in TE are selective to partial object features, the representation of a whole object requires the combined activation of at least several of these neurons. In other words, anterior IT seems to code for objects in a sparsely distributed manner (Rolls, 2009; E. Thomas, Van Hulle, & Vogels, 2001), which is confirmed by analyses showing that the way in which information about a stimulus increases with the number of IT neurons that are sampled is in line with a sparsely distributed code (Abbott, Rolls, & Tovee, 1996; Hung, Kreiman, Poggio, & DiCarlo, 2005; Rolls, Treves, & Tovee, 1997). It appears that TE neurons that code for similar features cluster in columns (Fujita, Tanaka, Ito, & Cheng, 1992), that a single object activates neurons in several columns (Wang, Tanifuji, & Tanaka, 1998; Yamane, Tsunoda, Matsumoto, Phillips, & Tanifuji, 2006), and that the columns that are activated by two similar objects represent features that are common to both (Tsunoda, Yamane, Nishizaki, & Tanifuji, 2001). Thus damage to some contiguous region of IT (or any other visual cortical area) is likely to lead to perception deficits within a class of similar stimuli due to their shared perceptual features.
In fact, there is also now strong evidence that decision processes are not implemented within visual cortex. For example, Rolls, Judge, and Sanghera (1977) made recordings from neurons in the inferotemporal cortices of monkeys. In these experiments, one visual stimulus was associated with a reward and one with a mildly aversive taste. After training, the rewards were switched. Thus, in effect, the animals were taught two simple categories (i.e., "good" and "bad"), and then the category assignments were switched. If the categorical decision was represented in the visual cortex, then the firing properties of visual cortical neurons should have changed when the category memberships were switched. However, Rolls et al. found no change in the response of any of these cortical neurons, although other similar studies have found changes in the responses of neurons in downstream brain areas (e.g., orbitofrontal cortex).
More recent studies have found similar null results with more traditional categorization tasks (Freedman, Riesenhuber, Poggio, & Miller, 2003; Op de Beeck, Wagemans, & Vogels, 2001; Sigala, 2004; E. Thomas et al., 2001; Vogels, 1999). In each of these studies, monkeys were taught to classify visual objects into one of two categories (e.g., tree versus non-tree, two categories of arbitrary complex shapes). Single-cell recordings showed that the firing properties of IT neurons did not change with learning. In particular, IT neurons showed sensitivity to specific visual images, but category training did not make them more likely to respond to other stimuli in the same category or less likely to respond to stimuli belonging to the contrasting category.
Similar results have been found in neurobiological studies of visual perceptual learning. The standard model of perceptual learning includes an early stage of sensory processing that is separate from a later stage of decision making (Amitay, Zhang, Jones, & Moore, 2014; Law & Gold, 2010). Theories proposing changes in the later decision stage of processing have been particularly successful in accounting for the available data (Amitay et al., 2014), including findings of heightened behavioral sensitivity that are not associated with changes in early sensory areas, but instead with the way that sensory information is used to form a decision variable at later stages of processing (e.g., Kahnt, Grueschow, Speck, & Haynes, 2011; Law & Gold, 2008).
On the other hand, under certain conditions, categorization training can change the firing properties of IT neurons. Sigala and Logothetis (2002; see also De Baene, Ons, Wagemans, & Vogels, 2008; Sigala, 2004) trained two monkeys to classify faces into one of two categories and then, in a separate condition, to classify fish. In both conditions, some stimulus features were diagnostic and some were irrelevant to the categorization response. After categorization training, many neurons in IT showed enhanced sensitivity to the diagnostic features compared to the irrelevant features. Such changes are consistent with the widely held view that category learning is often associated with changes in the allocation of perceptual attention (Nosofsky, 1986). Accounting for such shifts in perceptual attention is straightforward in GRT. The typical approach is to assume that increasing the amount of attention allocated to a perceptual dimension reduces perceptual variance on that dimension (Maddox et al., 2002; Soto et al., 2014).
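The attention-as-variance-reduction idea can be illustrated with a one-dimensional discriminability calculation. The sketch below is a minimal example, not an analysis from the studies cited: the function name is illustrative, and the specific means and standard deviations are assumed values chosen only to show the effect.

```python
def d_prime(mu1, mu2, sigma):
    """Discriminability (d') between two stimulus levels whose
    perceptual effects are Gaussian with means mu1 and mu2 and a
    common standard deviation sigma on one perceptual dimension."""
    return abs(mu2 - mu1) / sigma

# Same pair of perceptual means, but attention to the dimension is
# assumed to halve the perceptual standard deviation.
unattended = d_prime(0.0, 1.0, sigma=1.0)  # d' = 1.0
attended = d_prime(0.0, 1.0, sigma=0.5)    # d' = 2.0
```

Shrinking the variance on the attended dimension pulls the perceptual distributions apart in standardized units, so diagnostic features become easier to discriminate even though their mean perceptual effects are unchanged.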
Changes in the selectivity of IT neurons after categorization training are consistent with the hypothesis that category learning is mediated outside the visual system and that the attentional effects of categorization training are propagated back to visual areas through feedback projections (see Gilbert & Sigman, 2007; Kastner & Ungerleider, 2000). In support of this hypothesis, the effect of category learning on neural responses is stronger in non-visual areas, such as the striatum and prefrontal cortex (PFC), than in IT (De Baene et al., 2008; Seger & Miller, 2010). Simultaneous recordings from PFC and IT neurons during category learning show that, although IT neurons change their firing after learning, the changes are weaker than i...