On the need for a working memory in the information-processing approach to cognition
At the turn of the 1950s, it became evident that behaviour could not entirely be described and explained in terms of stimulus-response links. As Tolman (1948) demonstrated, even rats in mazes behaved as problem solvers, using cognitive maps instead of memorised sequences of actions. This strongly suggested that a level of mental representation might be hypothesised in accounting for a large variety of cognitive phenomena ranging from perception to formal reasoning. As Gardner (1985) claimed, demonstrating the validity of this approach was the major accomplishment of cognitive science. It is usually assumed that a first and decisive step in this revolution was the introduction by Shannon and Turing, both in 1950, of the idea that machines could authentically think. Shannon (1950) anticipated that it should be possible to conceive of electronic machines playing a very difficult game like chess, and thus thinking, while Turing (1950) imagined that such a machine might produce responses that could not be distinguished by any interrogator from those produced by a human being. Turing (1950) described these machines as consisting of three parts, a store of information corresponding to human memory in which information is broken up into small packets or chunks, an executive unit carrying out individual operations contained in some ‘book of rules’, and a control verifying that the instructions maintained in the store have been correctly obeyed. The strength of Turing's idea was that the digital computer he envisioned should be conceived as a universal machine able to mimic any discrete state machine if suitably programmed. Accordingly, Newell and Simon (1956; Newell et al. 1958) showed that it was possible to simulate the cognitive processes involved in chess playing with heuristic programmes using a flexible information-processing language.
The fact that these simulations were successful strongly suggested an isomorphism between the basic organisation of human problem solving and the list structures for representing instructions that Newell and his collaborators used in programming their computer. Miller et al. (1960) called the cognitive structures corresponding in the mind to the programmes in a computer ‘Plans’. They assumed that when the decision is made to execute a Plan, it is taken out of dead storage and brought into the focus of attention for controlling what they called a segment of information-processing capacity. Miller and colleagues (p. 65) emphasised that ‘something important’ happens to the selected Plan during this process and that its special cognitive status required some machinery for its maintenance and execution. This machinery was described as a ‘working memory’ in a paragraph that we shall quote in full here for its visionary character.
The parts of a Plan that is being executed have special access to consciousness and special ways of being remembered that are necessary for coordinating parts of different Plans and for coordinating with the Plans of other people. When we have decided to execute some particular Plan, it is probably put into some special state or place where it can be remembered while it is being executed. Particularly if it is a transient, temporary kind of Plan that will be used today and never again, we need some special place to store it. The special place may be on a sheet of paper. Or (who knows?) it may be somewhere in the frontal lobes of the brain. Without committing ourselves to any specific machinery, therefore, we should like to speak of the memory we use for the execution of our Plans as a kind of quick-access, “working memory”. There may be several Plans, or several parts of a single Plan, all stored in working memory at the same time. In particular, when one Plan is interrupted by the requirement of some other Plan, we must be able to remember the interrupted Plan in order to resume its execution when the opportunity arises. When a Plan has been transferred into working memory we recognize the special status of its incompleted parts by calling them “intentions”.
(Miller et al. 1960, p. 65)
Apart from the relationships between working memory and attention (although, as we saw above, Plans being executed are brought into the focus of attention), several key aspects of the modern conception of working memory are already present in this first description: the links between working memory and consciousness, its dual function of processing and storage through the need to remember Plans while they are being executed, the fast access to the stored information, and even hypotheses about a cerebral localisation in the frontal cortex. The limited capacity of working memory is among the rare characteristics omitted by Miller and colleagues in their description. However, from its very inception, the information-processing approach emphasised the limitations of the cognitive system that any theoretical or computational model should reflect (Turing 1948). Needless to say, Miller was more aware than anybody else of the strong limitations of the information-processing system, suggesting in his famous article ‘The magical number seven, plus or minus two’ that the mean capacity of the channel for a range of stimulus variables was 2.6 bits with a standard deviation of only 0.6 bit, corresponding to a mean of about 6.5 distinguishable alternatives and a total range from 3 to 15 categories, which he considered a remarkably narrow range (Miller 1956). Interestingly, but according to Miller quite coincidentally, the span of immediate memory was approximately equivalent and limited to about seven chunks.
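The correspondence between bits and distinguishable alternatives rests on the standard definition of information for equally likely alternatives: n alternatives carry log2(n) bits, and conversely b bits distinguish 2^b alternatives. The following minimal sketch illustrates this conversion; the function names are ours and purely illustrative, and the assumption of equally likely alternatives is a simplification.

```python
import math


def bits_to_alternatives(bits: float) -> float:
    """Number of equally likely alternatives that `bits` of information can distinguish."""
    return 2.0 ** bits


def alternatives_to_bits(n: float) -> float:
    """Information, in bits, carried by a choice among `n` equally likely alternatives."""
    return math.log2(n)


if __name__ == "__main__":
    # Mean channel capacity and one standard deviation on either side (values from the text).
    for b in (2.0, 2.6, 3.2):
        print(f"{b:.1f} bits -> {bits_to_alternatives(b):.1f} alternatives")
    print(f"6.5 alternatives -> {alternatives_to_bits(6.5):.2f} bits")
```

Under this strict conversion, 2.6 bits correspond to about 6.1 alternatives and 6.5 alternatives to about 2.7 bits, which is close to, though not identical with, the rounded figures reported above.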
Thus, conceiving the mind as a system that processes information following some internal programme instead of a learning device storing stimulus-response associations made it necessary to conceive of a machinery with the capacity to represent the information to be processed, to remember it while executing the operations listed in the programme, and to control the outputs of these processes. Miller et al. (1960) coined the expression ‘working memory’ to describe such a system and identified its main characteristics, but they did not go further in describing its structure. The information-processing approach required a working memory, but what was the best candidate for this role within the conceptual toolbox of psychology in the middle of the twentieth century? Quite naturally, the answer was ‘short-term memory’. As Atkinson and Shiffrin (1971) noted, the distinction between short-term memory and long-term memory that was introduced at the end of the nineteenth century by Ebbinghaus (1885) or James (1890) was largely discarded by behaviourism, but was reintroduced during the 1950s, receiving considerable theoretical developments. A variety of models of short-term memory were proposed (for example, Broadbent 1958; Waugh and Norman 1965; Atkinson and Shiffrin 1965) that presented a sufficient degree of similarity to be summarised by Murdock (1967) into a synthesis he named the ‘modal model’, which distinguished three main levels: a sensory store, a primary memory and a secondary memory. The model designed by Atkinson and Shiffrin (1968) was probably the most influential version of this ‘modal model’. They suggested that the entire memory system could be described in terms of a flow of information into and out of a short-term storage and the control that the subject can exert on that flow.
In their model, information was first processed by sensory systems (visual, auditory or haptic, for example) and entered into the short-term store, where a variety of control processes could code the incoming information in different ways, maintain a limited number of items through rehearsal, copy these items into a long-term store, retrieve knowledge from this long-term store or make decisions for action. Atkinson and Shiffrin (1971) stressed that their account did not require the assumption that short-term and long-term stores involve different physiological structures or different neural substrates. Short-term memory could simply be the activated part of long-term memory, an idea that has been endorsed by several modern theories of working memory. More important in their view was what distinguishes the two stores, namely two key and unique properties of short-term memory: it would contain the thoughts and information of which we are currently aware, and it would be where control processes take place. These properties turn short-term memory into a working memory. This was perfectly in line with Miller et al.'s (1960) suggestion to designate by the term ‘working memory’ the memory used for remembering those parts of the Plan currently being executed that have special access to consciousness.
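The flow of information described in this account can be pictured with a toy sketch. It is only an illustration under our own simplifying assumptions, not Atkinson and Shiffrin's formal model: the class and method names are hypothetical, the capacity of four items is arbitrary, and the idea that each rehearsal copies an item into the long-term store is a deliberate simplification.

```python
from collections import deque


class ModalModelSketch:
    """Toy illustration of a limited short-term store feeding a long-term store."""

    def __init__(self, capacity: int = 4):
        # Assumption: a fixed number of slots; the oldest item is displaced when full.
        self.short_term = deque(maxlen=capacity)
        self.long_term = set()

    def perceive(self, item: str) -> None:
        """An item processed by the sensory systems enters the short-term store."""
        self.short_term.append(item)

    def rehearse(self, item: str) -> bool:
        """Control process: refresh an item and copy it into the long-term store."""
        if item not in self.short_term:
            return False
        self.short_term.remove(item)
        self.short_term.append(item)  # refreshed item becomes the most recent one
        self.long_term.add(item)      # simplification: every rehearsal makes a copy
        return True

    def retrieve(self, item: str) -> bool:
        """Control process: bring long-term knowledge back into the short-term store."""
        if item not in self.long_term:
            return False
        self.perceive(item)
        return True


if __name__ == "__main__":
    memory = ModalModelSketch(capacity=3)
    for letter in "ABC":
        memory.perceive(letter)
    memory.rehearse("A")
    for letter in "DEF":
        memory.perceive(letter)
    print(list(memory.short_term), memory.retrieve("A"), memory.retrieve("B"))
```

In this sketch, an item that was rehearsed before being displaced can later be retrieved, whereas an item that was displaced without rehearsal is lost, which is the kind of contrast the control processes are meant to capture.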
A seminal study: Baddeley and Hitch (1974)
Atkinson and Shiffrin were not alone in assigning a crucial role to short-term memory in complex cognition such as learning (Peterson 1966), language comprehension (Rumelhart et al. 1972) or problem solving (Hunt 1971). However, it remained unclear at the time whether the different roles of memory in information processing could be fulfilled by a single unitary system of short-term memory. For example, Posner (1967) agreed that because any information processing requires keeping track of incoming information and bringing this information into contact with stored knowledge, some short-term memory system was needed to fulfil these functions. However, he noted some ambiguity in the short-term memory literature of that time, which used a single mechanism for the ephemeral representation of a stimulus, the retention of new information over brief periods of time and the activation of information from long-term memory needed by complex cognition (what Hunter 1964 described as an operational memory), while there was no evidence that these three functions had identical limitations. In line with Posner, Baddeley and Hitch (1974) observed that, despite the fact that short-term memory had been frequently assigned the role of operational or working memory, empirical evidence for this hypothesis was remarkably sparse, opening their chapter by asserting that ‘we still know virtually nothing about its [short-term memory] role in normal human information processing’ (Baddeley and Hitch 1974, p. 47). Empirical evidence was not only sparse but often contradicted the idea that short-term memory played any role in complex cognition and processes like concept formation (Coltheart 1972) or retrieval (Patterson 1971).
According to Baddeley and Hitch, the strongest evidence against the hypothesis equating short-term memory with working memory came from the neuropsychological observations of Shallice and Warrington (Shallice and Warrington 1970; Warrington and Shallice 1969), who extensively studied patient K. F., who had a greatly reduced short-term memory capacity but preserved performance in learning, memory or comprehension. Thus, Atkinson and Shiffrin's hypothesis of a short-term storage acting as a working memory lacked empirical support. Consequently, Baddeley and Hitch presented a series of experiments that attempted to answer two main questions. The first concerned the existence of a common working memory system shared by reasoning, comprehension and learning. The second, if such a system existed, concerned its relations with short-term memory. This study played, and is still playing, such a role in how cognitive psychology conceives working memory that we will present it in some detail here. We will see that the way Baddeley and Hitch interpreted their results had a strong and enduring impact on cognitive psychology, whereas other interpretations, at least as plausible as those they endorsed, would have led to a totally different view of working memory structure and functioning.
Baddeley and Hitch (1974) ran a first series of experiments that addressed the role of short-term memory in reasoning, language comprehension and learning. They reasoned that if these processes share some common working memory system corresponding to the short-term memory described by Atkinson and Shiffrin (1971), and taking for granted that short-term memory has a limited capacity, then absorbing some of this capacity by a memory load should have a detrimental effect on concurrent cognitive processes. For this purpose, they asked participants to perform a reasoning task while holding items in memory using a preload technique. The reasoning task consisted in presenting a sentence that described the order of occurrence of two letters that immediately followed the sentence (for example, ‘A is not preceded by B – AB’). The difficulty of this task was varied by using passive sentences and introducing negations. Participants were asked to decide as quickly as possible whether the sentence correctly described the order in which the letters were presented by pressing appropriate keys for ‘true’ and ‘false’ responses. In a first experiment, participants were asked to perform this reasoning task while holding zero, one or two letters in memory. This first experiment revealed no effect of memory load on solution times. This would suggest either that the memory system involved in maintaining the letters is not relevant for the reasoning task or that the memory preload was not sufficient. Thus, th...