1 Hierarchical Control in the Execution of Action Sequences: Tests of Two Invariance Properties
Saul Sternberg
University of Pennsylvania
Ronald L. Knoll
AT&T Bell Laboratories
David L. Turock
Bell Communications Research
Abstract
What might it mean for execution of an action sequence to be controlled hierarchically? We argue that if production of a sequence consists of the execution of nested constituent subsequences, then it should be characterized by two invariance properties that limit the effects of one part of the sequence on another. Because one such constituent structure merely partitions the stream of action into action units, these properties have wide applicability. According to low-level invariance, the process that executes a constituent should not be influenced by changes in any higher level constituent. According to high-level invariance, changes in a constituent should have at most limited and local effects on higher level constituents. We report on tests of these two properties in the rapid production of brief utterances and short strings of keystrokes, in which we examine the effects of sequence length, serial position, and unit size on measures of timing. The tests support the existence of hierarchical constituents at the level of the stroke in typing and the stress group in speech, but provide only limited evidence for deeper hierarchical structure.
Introduction
In this chapter we investigate one sense in which execution of action sequences might be hierarchical, by virtue of their being composed of separately controlled constituent subsequences. We argue that to claim merely that a continuous stream of action can be decomposed into a string of concatenated units is to assert a simple form of such hierarchical structure; properties that distinguish the constituents in such a structure should thus characterize purported action units. We focus on two invariance properties that flow from this sense of hierarchical control (properties that reflect the idea that different control levels function autonomously and thus impose limits on the effects of sequential context) and report attempts to determine the extent to which these properties characterize timing in the rapid production of speech and keystroke sequences. The spirit of our inquiry is to minimize the number of ancillary assumptions and thus avoid strong models. Our aim is to illustrate some alternative approaches to testing for hierarchical structure, applying them in most cases to data collected for other purposes.
In our examination of evidence we consider instances where short action sequences prescribed well in advance are correctly produced under time pressure, and where time measurements thus indicate performance constraints. We justify this choice of procedure by our desire to separate the execution of planned sequences from the process by which they are planned. Performance measurements indicate that under these conditions a plan or "program" for the entire sequence exists before it is initiated (Rosenbaum, Kenny, & Derr, 1983; Sternberg, Monsell, Knoll, & Wright, 1978). We deliberately do not investigate cases where choice of sequence is free rather than prescribed (e.g., Fentress, 1983), or where errors are of central interest (e.g., Shattuck-Hufnagel, 1983): The choice of action element (when choice is free) or errors in such choice (when the sequence is prescribed) could occur during the planning process as well as the execution process, whereas temporal effects in the execution of prescribed sequences seem more likely to be associated with the latter. Because our concern is with how the control mechanism selects successive actions, we consider primarily sequences whose successive actions are distinct rather than repeating.
Properties of Sequences Under Hierarchical Control
Concepts of Hierarchy
One example of the numerous ways¹ in which the term hierarchy has been used is to denote a simple ordering on some dimension, often described as a set of levels. More interesting are tree-like branching structures, consisting of a set of elements (nodes) at different levels, partially ordered by a relation (branches), usually antisymmetric and transitive (Wall, 1972). Examples of relations are inclusion and control. If the relation is inclusion, we have a classification hierarchy in which each class (at one level) consists of a set of subclasses (at the next), and a subclass can belong to only one class. If the relation is control, each element (at one level) controls a set of elements (at the next), and an element can be controlled by at most one higher level element.
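The control relation just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the element names (`sequence`, `unit_1`, `action_a`, and so on) and the functions `ancestors` and `dominates` are hypothetical and do not come from the chapter. Storing one controller per element enforces the condition that an element is controlled by at most one higher level element; following the chain upward gives the transitive relation, and a cycle check guards antisymmetry.

```python
# Hypothetical control hierarchy: each key is an element, and each value is
# the single higher-level element that controls it (None marks the root).
controller = {
    "sequence": None,
    "unit_1": "sequence",
    "unit_2": "sequence",
    "action_a": "unit_1",
    "action_b": "unit_1",
    "action_c": "unit_2",
}


def ancestors(node, parent_of):
    """Return every element above `node`, nearest controller first."""
    chain = []
    current = parent_of[node]
    while current is not None:
        # Revisiting an element means the relation contains a cycle,
        # which would violate antisymmetry.
        if current == node or current in chain:
            raise ValueError("cycle found: relation is not antisymmetric")
        chain.append(current)
        current = parent_of[current]
    return chain


def dominates(higher, lower, parent_of):
    """Transitive control: `higher` lies somewhere above `lower`."""
    return higher in ancestors(lower, parent_of)
```

With this representation, `dominates("sequence", "action_a", controller)` holds by transitivity, whereas `dominates("unit_2", "action_a", controller)` does not, because `action_a` has `unit_1` as its only immediate controller.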
In applying this idea to action sequences we are interested in strings of rapid actions that can be said to contain one or more separately controlled substring constituents (or units), larger than a "single" action (that is, susceptible to further analysis), but smaller than the whole string (Gallistel, 1980, pp. 288-290). The validity of such an assertion hinges theoretically on the specification of criteria that define a substring as a constituent, and empirically on the demonstration that such criteria are satisfied. This sense of hierarchy should be distinguished from the idea of different levels of specificity, such that more detailed aspects of the same action are controlled autonomously at lower levels, closer to its execution (Szentagothai & Arbib, 1975), and where control branches can converge and cross (Gallistel, 1980).
Perhaps the most familiar domain where hierarchically organized sequences have been used to characterize human behavior is language. The constituent structure of sentences is represented in linguistics by the recursively branching phrase marker (Wall, 1972), which we introduce to add precision to the idea of hierarchical structure. The phrase marker can be expressed as either an ordered tree (Fig. 1.1A) or a labeled-bracket structure (Fig. 1.1B). An ordered tree consists of a set of nodes (from a root at the highest level to leaves at the lowest, or first) connected by diverging and non-crossing branches. Any pair of nodes is related either by dominance (for instance, in Fig. 1.1 VP dominates the right-hand N, and is thus at a higher "level") or by precedence (for instance, the upper NP precedes V) but not by both. Each node corresponds to a substring constituent of the sentence. The nesting of constituents is more evident in the labeled-bracket structure, in which dominance becomes inclusion.
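The correspondence between the two forms of the phrase marker can be illustrated with a short Python sketch. The sentence and tree below are hypothetical and are not the example shown in Fig. 1.1; they merely share its general shape (an upper NP that precedes V, and a VP that dominates a right-hand N). The tuple encoding and the function names `to_brackets` and `words` are likewise assumptions made for illustration.

```python
# Hypothetical ordered tree: each node is (label, child, child, ...);
# a leaf node is (label, word). Left-to-right order encodes precedence.
tree = (
    "S",
    ("NP", ("N", "dogs")),
    ("VP", ("V", "chase"), ("NP", ("N", "cats"))),
)


def is_leaf(node):
    """A leaf carries a single word (a string) instead of child nodes."""
    return len(node) == 2 and isinstance(node[1], str)


def to_brackets(node):
    """Render the tree as nested labeled brackets (dominance as inclusion)."""
    label = node[0]
    if is_leaf(node):
        return f"[{label} {node[1]}]"
    inner = " ".join(to_brackets(child) for child in node[1:])
    return f"[{label} {inner}]"


def words(node):
    """Return the substring (list of words) that a node corresponds to."""
    if is_leaf(node):
        return [node[1]]
    return [w for child in node[1:] for w in words(child)]


print(to_brackets(tree))
# prints: [S [NP [N dogs]] [VP [V chase] [NP [N cats]]]]
```

Here `words(tree[2])` gives the substring constituent for the VP node, `["chase", "cats"]`, which shows in miniature how each node corresponds to a substring of the sentence.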
A hierarchically organized string may thus be defined as a set of constituent (sub)strings, partially ordered by inclusion, such that any string either fully contains, or is fully contained in, or is disjoint from any other string, and such that disjoint constituents are temporally ordered. A string of "action unit" substrings defines the most shallow structure that can be described as hierarchical; in a deeper hierarchy the substrings would be further partitioned into smaller disjoint substrings.
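The definition above lends itself to a simple check, sketched here in Python under stated assumptions. Constituents are represented as half-open spans `(start, end)` over element positions in the string; this encoding, the example spans, and the names `relation` and `is_hierarchical` are hypothetical and chosen only for illustration. Spans are assumed to be non-empty. Because the spans lie on a single time line, any two disjoint spans are automatically temporally ordered, so the only condition left to test is that no two constituents partially overlap.

```python
from itertools import combinations


def relation(a, b):
    """Classify two non-empty half-open spans (start, end)."""
    (a0, a1), (b0, b1) = a, b
    if a1 <= b0 or b1 <= a0:
        return "disjoint"       # one ends before the other begins
    if a0 <= b0 and b1 <= a1:
        return "contains"       # a fully contains b
    if b0 <= a0 and a1 <= b1:
        return "contained"      # a is fully contained in b
    return "overlap"            # partial overlap: not allowed


def is_hierarchical(spans):
    """True if every pair of distinct spans is nested or disjoint."""
    return all(
        relation(a, b) != "overlap"
        for a, b in combinations(set(spans), 2)
    )


# Shallowest case: a six-element string partitioned into three action units.
shallow = [(0, 6), (0, 2), (2, 4), (4, 6)]
# Deeper case: the first unit is further partitioned into two substrings.
deeper = shallow + [(0, 1), (1, 2)]
# Violation: (0, 3) and (2, 5) share position 2 without nesting.
crossing = [(0, 6), (0, 3), (2, 5)]
```

In this sketch `is_hierarchical(shallow)` and `is_hierarchical(deeper)` both hold, whereas `is_hierarchical(crossing)` fails because two of its spans overlap without either containing the other.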
FIG. 1.1. Two forms of the phrase marker representation of a sentence. A: An ordered tree composed of nodes and branches. B: A partially equivalent structure of nested labeled brackets.
Two Invariance Properties of Hierarchically Controlled Sequences
Augmenting of hierarchical structure with hierarchical control. The phrase marker is a hierarchical structure with no commitment to any particular embodiment in real time or to any specification of the flow of c...