Introduction to the Problem of Mental Storage Capacity

One of the central contributions of cognitive psychology has been to explore limitations in the human capacity to store and process information. Although the distinction between a limited-capacity primary memory and an unlimited-capacity secondary memory was described by James (1890), Miller's (1956) theoretical review of a "magical number seven, plus or minus two" is probably the most seminal paper in the literature for investigations of limits in short-term memory (STM) storage capacity. It was, in fact, heralded as one of the most influential Psychological Review papers ever, in a 1994 centennial issue of the journal. Miller's reference to a magical number, however, was probably just a rhetorical device. A more central focus of his article was the ability to increase the effective storage capacity through the use of intelligent grouping or "chunking" of items. He ultimately suggested that the specific limit of 7 probably emerged as a coincidence.

Over 40 years later, we are still uncertain as to the nature of storage capacity limits. According to some current theories there is no limit in storage capacity per se, but a limit in the duration for which an item can remain active in STM without rehearsal (e.g., Baddeley, 1986; Richman, Staszewski, & Simon, 1995). This has led to a debate about whether the limitation is a "magic number or magic spell" (Schweickert & Boruff, 1986) or whether rehearsal really plays a role (Brown & Hulme, 1995). One possible resolution is that the focus of attention is capacity-limited whereas various supplementary storage mechanisms, which can persist temporarily without attention, are time-limited rather than capacity-limited (Cowan, 1988, 1995). Other investigators, however, have long questioned whether temporary storage concepts are necessary at all, suggesting that the rules of learning and memory could be identical in both the short and long term (Crowder, 1993; McGeoch, 1932; Melton, 1963; Nairne, 1992; Neath, 1998).

At present, the basis for believing that there is a time limit to STM is controversial and unsettled (Cowan, Saults, & Nugent, 1997; Cowan, Wood, Nugent, & Treisman, 1997; Crowder, 1993; Neath & Nairne, 1995; Service, 1998). The question is nearly intractable because any putative effect of the passage of time on memory for a particular stimulus could instead be explained by a combination of various types of proactive and retroactive interference from other stimuli. In any particular situation, what looks like decay could instead be displacement of items from a limited-capacity store over time. If the general question of whether there is a specialized STM mechanism is to be answered in the near future then, given the apparent unresolvability of the decay issue, general STM questions seem more likely to hinge on evidence for or against a chunk-based capacity limit.

The evidence regarding this capacity limit also has been controversial. According to one view (Wickens, 1984) there is not a single capacity limit, but several specialized capacity limits. Meyer and Kieras (1997) questioned the need for capacity limits to explain cognitive task performance; they instead proposed that performance scheduling concerns (and the need to carry out tasks in the required order) account for apparent capacity limits. The goal of this target article is to provide a coherent account of the evidence on storage capacity limits to date.

One reason why a resolution may be needed is that, as mentioned above, the theoretical manifesto announcing the existence of a capacity limit (Miller, 1956) did so with considerable ambivalence toward the hypothesis. Although Miller's ambivalence was, at the time, a sophisticated and cautious response to available evidence, a wealth of subsequent information suggests that there is a relatively constant limit in the number of items that can be stored in a wide variety of tasks; but that limit is only 3 to 5 items as the population average. Henderson (1972, p. 486) cited various studies on the recall of spatial locations or of items in those locations, conducted by Sperling (1960), Sanders (1968), Posner (1969), and Scarborough (1971), to make the point that there is a "new magic number 4 ± 1." Broadbent (1975) proposed a similar limit of 3 items on the basis of more varied sources of information including, for example, studies showing that people form clusters of no more than 3 or 4 items in recall. A similar limit in capacity was discussed, with various theoretical interpretations, by others such as Halford, Maybery, and Bain (1988), Halford, Wilson, and Phillips (1998), Luck and Vogel (1997), and Schneider and Detweiler (1987).

The capacity limit is open to considerable differences of opinion and interpretation. The basis of the controversy concerns the way in which empirical results should be mapped onto theoretical constructs. Those who believe in something like a 4-chunk limit acknowledge that it can be observed only in carefully constrained circumstances. In many other circumstances, processing strategies can increase the amount that can be recalled. The limit can presumably be predicted only after it is clear how to identify independent chunks of information. Thus, Broadbent (1975, p. 4) suggested that "The traditional seven arises... from a particular opportunity provided in the memory span task for the retrieval of information from different forms of processing."

The evidence provides broad support for what can be interpreted as a capacity limit of substantially fewer than Miller's 7 ± 2 chunks; about 4 chunks on the average. Against this 4-chunk thesis, one can delineate at least 7 commonly held opposing views: (1) There are capacity limits, but they are in line with Miller's 7 ± 2 (e.g., still taken at face value by Lisman & Idiart, 1995). (2) Short-term memory is limited by the amount of time that has elapsed rather than by the number of items that can be held simultaneously (e.g., Baddeley, 1986). (3) There is no special short-term memory faculty at all; all memory results obey the same rules of mutual interference, distinctiveness, etc. (e.g., Crowder, 1993). (4) There may be no capacity limits per se but only constraints such as scheduling conflicts in performance and strategies for dealing with them (e.g., Meyer & Kieras, 1997). (5) There are multiple, separate capacity limits for different types of material (e.g., Wickens, 1984). (6) There are separate capacity limits for storage versus processing (Daneman & Carpenter, 1980; Halford et al., 1998). (7) Capacity limits exist, but they are completely task-specific, with no way to extract a general estimate. (This may be the "default" view today.) Even among those who agree with the 4-chunk thesis, moreover, a remaining possible ground of contention concerns whether all of the various phenomena that I will discuss are legitimate examples of this capacity limit.

These seven competing views will be re-evaluated in Section 4 (4.3.1-4.3.7). The importance of identifying the chunk limit in capacity is not only to know what that limit is, but more fundamentally to know whether there is such a limit at all. Without evidence that a consistent limit exists, the concepts of chunking and capacity limits are themselves open to question.

1.1 Pure capacity-based and compound STM estimates. I will refer to the maximum number of chunks that can be recalled in a particular situation as the memory storage capacity, and valid, empirically obtained estimates of this number of chunks will be called estimates of capacity-based STM. Although that chunk limit presumably always exists, it is sometimes not feasible to identify the chunks inasmuch as long-term memory information can be used to create larger chunks out of smaller ones (Miller, 1956), and inasmuch as time- and interference-limited sources of information that are not strictly capacity-limited may be used along with capacity-limited storage to recall information. In various situations, the amounts that can be recalled when the chunks cannot be specified, or when the contribution of non-capacity-limited mechanisms cannot be assessed, will be termed compound STM estimates. These presumably play an important role in real-world tasks such as problem-solving and comprehension (Daneman & Merikle, 1996; Logie, Gilhooly, & Wynn, 1994; Toms, Morris, & Ward, 1993). However, the theoretical understanding of STM can come only from knowledge of the basic mechanisms contributing to the compound estimates, including the underlying capacity limit. The challenge is to find sound grounds upon which to identify the pure capacity-based limit as opposed to compound STM limits.

1.2. Specific conditions in which a pure storage capacity limit can be observed. It is proposed here that there are at least four ways in which pure capacity limits might be observed: (1) when there is an information overload that limits chunks to individual stimulus items, (2) when other steps are taken specifically to block the recoding of stimulus items into larger chunks, (3) when performance discontinuities caused by the capacity limit are examined, and (4) when various indirect effects of the capacity limit are examined. Multiple procedures fit under each of these headings. For each of these, the central assumption is that the procedure does not enable subjects to group items into higher-order chunks. Moreover, the items must be familiar units with no pre-existing associations that could lead to the encoding of multi-object groups, ensuring that each item is one chunk in memory. Such assumptions are strengthened by an observed consistency among results.

The first way to observe clearly limited-capacity storage is to overload the processing system at the time that the stimuli are presented, so that there is more information in auxiliary or time-limited stores than the subject can rehearse or encode before the time limit is up. This can be accomplished by presenting a large spatial array of stimuli (e.g., Sperling, 1960) or by directing attention away from the stimuli at the time of their presentation (Cowan, Nugent, Elliott, Ponomarev, & Saults, 1999). Such manipulations make it impossible during the presentation of stimuli to engage in rehearsal or form new chunks (by combining items and by using long-term memory information), so that the chunks to be transferred to the limited-capacity store at the time of the test cue are the original items presented.

The second way is with experimental conditions designed to limit the long-term memory and rehearsal processes. For example, using the same items over and over on each trial and requiring the recall of serial order limits subjects' ability to think of ways to memorize the stimuli (Cowan, 1995); and rehearsal can be blocked through the requirement that the subject repeat a single word over and over during the stimulus presentation (Baddeley, 1986).

The third way is to focus on abrupt changes or discontinuities in basic indices of performance (proportion correct and reaction time) as a function of the number of chunks in the stimulus. Performance on various tasks takes longer and is more error-prone when it involves a transfer of information from time-limited buffers, or from long-term memory, to the capacity-limited store than when it relies on the contents of capacity-limited storage directly. This results in markedly less accurate and/or slower performance when more than 5 items must be held than when fewer items must be held (e.g., in enumeration tasks such as that discussed by Mandler & Shebo, 1982).

Fourth, unlike the previous methods, which have involved an examination of the level of performance in the memory task, there also are indirect effects of the limit in capacity. For example, lists of items tend to be grouped by subjects into chunks of about 4 items for recall (Broadbent, 1975; Graesser & Mandler, 1978), and the semantic priming of one word by another word or learning of contingencies between the words appears to be much more potent if the prime and target are separated by about 3 or fewer words (e.g., McKone, 1995).

1.2.1. Other restrictions on the evidence. Although these four methods can prevent subjects from amalgamating stimuli into higher-order chunks, the resulting capacity estimates can be valid only if the items themselves reflect individual chunks, with strong intra-chunk associations and weak or (ideally) absent inter-chunk associations. For example, studies with nonsense words as stimuli must be excluded because, in the absence of pre-existing knowledge of the novel stimulus words, each word may be encoded as multiple phonemic or syllabic subunits with only weak associations between these subunits (resulting in an underestimate of capacity). As another example, sets of dots forming familiar or symmetrical patterns would be excluded for the opposite reason, that multiple dots could be perceived together as a larger object with non-negligible inter-dot associations, so that each dot would not be a separate chunk (resulting in an overestimate of capacity). It also is necessary to exclude procedures in which the central capacity's contents can be recalled and the capacity then re-used (e.g., if a visual array remains visible during recall) or, conversely, in which the information is not available long enough or clearly enough for the capacity to be filled even once (e.g., brief presentation with a mask). In Section 3, converging types of evidence will be offered as to the absence of inter-item chunking in particular experimental procedures (e.g., a fixed number of items correctly recalled regardless of the list or array size).

Finally, it is necessary to exclude procedures in which the capacity limit must be shared between chunk storage and the storage of intermediate results of processing. One example of this is the "n-back task" in which each item in a continuous series must be compared with the item that occurred n items ago (e.g., Cohen et al., 1997; Poulton, 1954) or a related task in which the subject must listen to a series of digits and detect three odd digits in a row (Jacoby, Woloshyn, & Kelley, 1989). In these tasks, in order to identify a fixed set of the most recent n items in memory, the subject must continually update the target set in memory. This task requirement may impose a heavy additional storage demand. These demands can explain why such tasks remain difficult even with n = 3.
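The continual-updating demand described above can be made concrete in a brief sketch. This is purely illustrative and not drawn from the article; the function name and the use of a fixed-length buffer to model the subject's target set are my own assumptions.

```python
from collections import deque

def n_back_hits(stream, n):
    """Mark each position where the current item matches the item n back.

    The fixed-length deque models the continually updated target set that
    the subject must maintain: with every new item, the oldest target drops
    out and a new one enters. It is this bookkeeping, on top of chunk
    storage itself, that imposes the extra demand noted in the text.
    """
    recent = deque(maxlen=n)  # holds exactly the last n items
    hits = []
    for item in stream:
        # A hit requires a full window and a match with the item n back.
        hits.append(len(recent) == n and recent[0] == item)
        recent.append(item)  # update the target set; oldest item is evicted
    return hits

print(n_back_hits("ABAB", 2))  # [False, False, True, True]
```

Even this mechanical version must touch and revise the stored window on every trial, which is one way to see why the task stays hard even at n = 3.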

It may be instructive to consider a hypothetical version of the n-back task that would be taken to indicate the existence of a special capacity limit. Suppose that the subject's task were to indicate, as rapidly as possible, if a particular item had been included in the stimulus set previously. Some items would be repeated in the set but other, novel items also would be introduced. On positive trials, the mean reaction time should be much faster when the item had been presented within the most recent 3 or 4 items than when it was presented only earlier in the sequence. To my knowledge, such a study has not been conducted. However, in line with the expectation, probed recall experiments have resulted in shorter reaction times for the most recent few items (Corballis, 1967).

The present view is that a strong similarity in pure capacity limits (to about 4 chunks on average) can be identified across many test procedures meeting the above four criteria. The subcategories of methods and some key references are summarized in Table 1 (see section 3), and each area will be described in more detail in Section 3 of the review.

1.3. Definition of chunks. A chunk must be defined with respect to associations between concepts in long-term memory. I will define the term chunk as a collection of concepts that have strong associations to one another and much weaker associations to other chunks concurrently in use. (This definition is related to concepts discussed by Simon, 1974.) It is assumed that the number of chunks can be estimated only when inter-chunk associations are of no use in retrieval in the assigned task. To use a well-worn example inspired by Miller (1956), suppose one tries to recall the series of letters, "fbicbsibmirs." Letter triads within this sequence (FBI, CBS, IBM, and IRS) are well-known acronyms, and someone who notices that can use the information to assist recall. For someone who does notice, there are pre-existing associations between letters in a triad that can be used to assist recall of the 12-letter sequence. If we further assume that there are no pre-existing associations between the acronyms, then the four of them have to occupy limited-capacity storage separately to assist in recall. If that is the case, and if no other optional mnemonic strategies are involved, then successful recall of the 12-item sequence indicates that the pure capacity limit for the trial was at least 4 chunks. (In practice, within the above example there are likely to be associations between the acronyms. For example, FBI and IRS represent two U.S. government agencies, and CBS and IBM represent two large U.S. corporations. Such associations could assist recall. For the most accurate pure capacity-based limit, materials would have to be selected so as to eliminate such special associations between chunks.) Notice that the argument is not that long-term memory fails to be involved in capacity-based estimates. Long-term memory is inevitably involved in memory tasks. The argument is that the purest capacity estimates occur when long-term memory associations are as strong as possible within identified chunks and absent between those identified chunks.
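The role of long-term memory in forming chunks can be sketched mechanically. In this illustrative fragment (not from the article), a hypothetical inventory of known acronyms stands in for pre-existing long-term memory associations; a greedy parse then shows how the same 12-letter string costs 4 chunks for a subject who knows the acronyms, but 12 chunks for one who does not.

```python
# Hypothetical stand-in for pre-existing long-term memory associations.
KNOWN_CHUNKS = {"FBI", "CBS", "IBM", "IRS"}

def segment(letters, known=KNOWN_CHUNKS, max_len=4):
    """Greedily parse a letter string into known chunks, else single letters."""
    letters = letters.upper()
    chunks, i = [], 0
    while i < len(letters):
        for size in range(max_len, 1, -1):  # prefer the largest known chunk
            if letters[i:i + size] in known:
                chunks.append(letters[i:i + size])
                i += size
                break
        else:
            chunks.append(letters[i])  # an unrecognized letter is its own chunk
            i += 1
    return chunks

print(segment("fbicbsibmirs"))                     # ['FBI', 'CBS', 'IBM', 'IRS']
print(len(segment("fbicbsibmirs", known=set())))   # 12: one chunk per letter
```

The sketch also makes the measurement problem visible: the chunk count depends entirely on what the inventory contains, which is why materials must be chosen so that the experimenter knows which units are (and are not) pre-associated.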

If someone is given new material for immediate recall and can look at the material long enough before responding, new associations between the original chunks can be formed, resulting in larger chunks or, at least, conglomerates with nonzero associations between chunks. McLean and Gregg (1967, p. 455) provided a helpful description of chunks in verbal recall, as "groups of items recited together quickly," helpful because recall timing provides one good indication of chunking (see also Anderson & Matessa, 1997). McLean and Gregg (p. 456) described three ways in which chunks can be formed: "(a) Some stimuli may already form a unit with which S is familiar. (b) External punctuation of the stimuli may serve to create groupings of the individual elements. (c) The S may monitor his own performance and impose structure by selective attention, rehearsal, or other means."

The practical means to identify chunks directly is an important issue, but one that is more relevant to future empirical work than it is to the present theoretical review of already-conducted work, inasmuch as few researchers have attempted to measure chunks directly. Direct measures of chunks can include empirical findings of item-to-item associations that vary widely between adjacent items in a list, being high within a chunk and low between chunks; item-to-item response times that vary widely, being relatively short within a chunk and long between chunks; and subjective reports of grouping. For studies in which the main dependent measure is not overt recall, measures of chunking for a trial must follow the trial immediately if chunking cannot be derived from the main dependent measure itself. Schneider and Detweiler (1987, pp. 105-106) provide an excellent further discussion of how chunks can be identified through convergent measures.
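The timing-based measure mentioned above can be sketched as follows. This is an illustrative fragment, not a procedure from the literature; the function name and the 0.5-second pause threshold are arbitrary assumptions, chosen only to show how long inter-item pauses would mark chunk boundaries.

```python
def chunks_from_timing(items, gaps, threshold=0.5):
    """Segment a recall sequence at long inter-item pauses.

    gaps[i] is the pause (in seconds) between items[i] and items[i + 1];
    a pause longer than the (arbitrary) threshold is taken to mark a
    chunk boundary, following the idea that within-chunk items are
    recited together quickly (McLean & Gregg, 1967).
    """
    chunks, current = [], [items[0]]
    for item, gap in zip(items[1:], gaps):
        if gap > threshold:      # long pause: close the current chunk
            chunks.append(current)
            current = []
        current.append(item)
    chunks.append(current)
    return chunks

# Hypothetical recall timing: long pauses after every third letter.
pauses = [0.2, 0.2, 0.9, 0.2, 0.2, 0.9, 0.2, 0.2, 0.9, 0.2, 0.2]
print(chunks_from_timing(list("fbicbsibmirs"), pauses))
# [['f','b','i'], ['c','b','s'], ['i','b','m'], ['i','r','s']]
```

In practice such timing evidence would be one of several convergent measures, alongside association patterns and subjective reports, rather than a criterion on its own.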

For most of the research that will be summarized in Section 3 below, however, the researchers provided no direct evidence of chunking or its absence. The present assumption for these studies is that chunk size can be reasonably inferred from the presence of the task demands described above in Section 1.2, which should prevent inter-item chunking. The present thesis is that the great similarity of empirically-based chunk limits derived using these guidelines, reviewed in Section 3, supports their validity because the guidelines yield a parsimonious, relatively uniform description of capacity limits of 3 to 5 chunks as the population average (with a maximum range of 2 to 6 chunks in individuals).

