Invariance and Variability in Speech Processes

  1. 632 pages
  2. English

About This Book

First published in 1986. The important implications of speech variability for the future of speech-related technology, in combination with the multifaceted debate about invariance among speech scientists, make this a most appropriate time to evaluate the state of our knowledge in this area. On October 8-10, 1983, researchers from the fields of production, perception, acoustics, pathology, psychology, linguistics, language acquisition, synthesis and recognition met at a symposium at M.I.T. on invariance and variability of speech processes. This volume is the Proceedings of the symposium. Each chapter of the book consists of a focus paper followed by some comments.

Invariance and Variability in Speech Processes, by J. S. Perkell and D. H. Klatt. Subject area: Psychology; History & Theory in Psychology.

Information

Year: 2014
ISBN: 9781317768289
Edition: 1

1
Toward a Model of the Development of Speech Perception

Peter W. Jusczyk
Department of Psychology
University of Oregon
From the infant's point of view, the task of sorting out the varied array of sounds directed toward him or her into a coherent set of signals appears to be a formidable one. The sources of variation in utterances produced by even the same speaker in the same context would appear to be overwhelming. The problem is only confounded when one considers the variations produced in noisy environments, by different adults of the same sex, by children and adults, or by males and females. Yet, somehow during the course of the first year of life, infants act as if they understand certain words, and a few months later are producing their own words. Perhaps we should not be too surprised by this, since similar feats of perceptual constancy appear to be present in the visual domain (Day & McKenzie, 1973; Spelke, 1982), and even intermodally (Meltzoff, 1985; Rose, Gottfried, & Bridger, 1983). Nevertheless, the demonstration that invariance is found in other sensory modalities does not free us from the need to explain the instances of invariance that appear in connection with the infant's perception of speech sounds. Two speech perception phenomena would seem to require the infant's capacity to perceive an invariant relation across different tokens. The first is the well-known phenomenon of categorical perception; the second is the capacity to recognize the same utterances produced by different speakers. Both of these phenomena have been the object of investigation in studies of infant speech perception. In what follows, we review briefly some of the major findings in the field of infant speech perception. We consider the possible mechanisms that might underlie the infant's basic capacities, and then discuss the ways in which these capacities might develop during the course of language acquisition.

Infant Speech Perception Capacities

Discrimination of Simple Speech Contrasts

One of the most important findings in the early studies of infant speech perception was Eimas, Siqueland, Jusczyk, & Vigorito's (1971) demonstration of the existence of categorical discrimination of voicing cues. This result indicated not only that infants were capable of discriminating certain speech sounds, but that their ability to do so was similar to that of adults in a very important way. Previously, investigators had invoked explanations based on acquired distinctiveness and learned equivalence of cues to account for categorical perception, i.e., the fact that adults are sensitive to subtle acoustic differences between members of different phonetic categories but relatively insensitive to differences of the same magnitude occurring within a particular phonetic category (Liberman, Harris, Kinney, & Lane, 1961). The basic argument was that through extensive practice in producing and perceiving speech, listeners would, in time, learn to treat variants of a particular phonetic segment as being the same and as different from variants of other phonetic segments. In particular, the perceived sounds would be referenced with respect to the articulatory gestures used to produce them. In effect, acoustic variations that arose in the attempts to produce the same articulatory gesture would be ignored in perception. A key factor in this process was extensive practice in assigning the same label to the variants of a particular phonetic segment; this presumably contributed to the difficulty which adults had in distinguishing one variant from another on standard speech discrimination tests such as ABX. However, the Eimas et al. results indicated that such categorical effects were present in the response of infants who had not been subjected to a long period of discriminative training. Hence, many researchers began to view categorical perception as a consequence of the way in which the human perceptual system is structured right from birth (Eimas & Corbit, 1973).
Moreover, because at the time categorical perception was thought to occur only for speech sounds, these findings were taken to be an indication of an innate linguistic capacity (Cutting & Eimas, 1975; Eimas et al., 1971).
In the following years, many studies were undertaken to determine the variety of contrasts that infants are capable of perceiving. These studies indicate that infants are capable of discriminating virtually every type of phonetic contrast that they have been tested on (for a complete review, see Aslin, Pisoni, & Jusczyk, 1983). For example, there is evidence that infants are sensitive to place of articulation differences between stops (Eimas, 1974; Miller, Morse, & Dorman, 1977; Moffit, 1971; Morse, 1972; Till, 1976), fricatives (Holmberg, Morgan, & Kuhl, 1977; Jusczyk, Murray, & Bayly, 1979), glides (Jusczyk, Copan, & Thompson, 1978), and nasals (Eimas & Miller, 1977). Similarly, infants have been shown to discriminate manner of articulation contrasts such as those between stops and nasals (Eimas & Miller, 1980a), stops and glides (Eimas & Miller, 1980a; Hillenbrand, Minifie, & Edwards, 1979; Miller & Eimas, 1983), liquids (Eimas, 1975), and nasalized and nonnasalized vowels (Trehub, 1976a). Also, infants are able to discriminate a variety of vowel contrasts including [ɪ]-[i] (Swoboda, Kass, Morse, & Leavitt, 1978; Swoboda, Morse, & Leavitt, 1970), [a]-[i] (Kuhl, 1979; Trehub, 1973), [a]-[ɔ] (Kuhl & Miller, 1982) and [i]-[u] (Trehub, 1973). Therefore, it is quite apparent that the underlying capacities which infants possess for discriminating speech sounds extend well beyond those required for the detection of voicing contrasts.
Moreover, the infant's capacity for discriminating phonetic contrasts does not require specific experience with the sounds to be discriminated. Lasky, Syrdal-Lasky, & Klein (1975) found that infants from a Spanish-speaking environment showed sensitivity in the same region of the voice-onset-time (VOT) continuum as did Eimas et al.'s subjects, despite the fact that the VOT boundary for adult speakers in the same environment differed considerably from that of English speakers. Likewise, Streeter (1976) found that the discrimination performance for infants from a Kikuyu-speaking culture corresponded well to that of infants in Eimas et al.'s study. Moreover, other research suggested that cross-language commonalities in perceptual boundaries for infants are not limited to the perception of VOT differences. In one experiment, Trehub (1976a) found that 1- to 4-month-old infants from English-speaking homes were capable of discriminating the fricative contrast ([řa]/[za]) that occurs in Czech but not English. In a second experiment, she obtained a similar result for the discrimination of the nasalized/nonnasalized contrast ([pã]/[pa]), a contrast that is phonemic in Polish and in French but not in English. More recently, Werker and Tees (1983) have reported that 6- to 8-month-old infants from English-speaking homes are capable of discriminating contrasts from Hindi and from an American Indian language, Thompson. Interestingly enough, a follow-up study suggested that by 8 to 10 months of age, the capacity of infants to discriminate these contrasts is attenuated. Evidently, the influence of the infant's native language-learning environment during the latter portion of the first year of life may desensitize the infant to those contrasts not indigenous to the native language. In this sense, specific experience appears to play a role in the maintenance of, as opposed to the acquisition of, discriminative capacities in the infant.

Discrimination of Speech Contrasts in Different Utterance Contexts

In addition to tracking the variety of contrasts that infants are capable of discriminating, a number of researchers have investigated the way in which various contextual factors affect the infant's discriminative capacities. One approach has been to look at the effect of varying the location of the phonetic contrast in an utterance. The early studies in the field had all employed contrasts between the initial segments of single syllables. Hence, there was no way of determining from these investigations if infants processed information beyond the initial segment. Jusczyk (1977) attacked this problem in a study in which he looked at infants' ability to detect a [d]-[g] contrast occurring in either the initial or final segment of CVC syllables. His results indicated that infants are capable of processing phonetic differences beyond the initial segments of syllables (see also Williams & Bush, 1978). Moreover, there was no evidence that the syllable-final contrasts were any less discriminable for infants than syllable-initial ones. Interestingly, this latter result contrasts with findings observed for studies of phonemic perception in infants 1 year of age and older (Garnica, 1973; Shvachkin, 1973). Possible reasons for this discrepancy are considered below.
Additional studies have explored the infant's discrimination of phonetic contrasts in multisyllabic utterances. Again it was found that infants have the capacity to detect contrasts between segments occurring in other than the utterance-initial position (Jusczyk, Copan, & Thompson, 1978; Jusczyk & Thompson, 1978; Trehub, 1976b; Williams, 1977a). Studies with multisyllabic tokens also make it possible to examine how discrimination is affected by the presence of information regarding syllable stress. To date, there is no indication from young infants that unstressed syllables are any less discriminable than stressed syllables (Jusczyk, Copan, & Thompson, 1978; Jusczyk & Thompson, 1978; Williams, 1977a).

Perceptual Constancy

As important as the ability to discriminate phonetic contrasts regardless of their position in an utterance is the ability to recognize the same phonetic segment when spoken by different speakers or with a different inflection. The acoustic characteristics of speech sounds vary greatly from speaker to speaker, yet the adult listener is able to ignore such differences in recognizing the identity of a given word. In effect, perception of the phonemic segments is invariant across these differences. Kuhl and her coworkers have looked at the infant's capacity to ignore irrelevant differences in speakers' voices and intonation patterns in making phonetic discriminations. They first trained infants to discriminate between single tokens of two different syllables spoken by the same speaker. Then, in successive phases of the experiment, they introduced new tokens of the syllables spoken by different speakers and with varying intonation contours. The infants were deemed to have achieved some degree of perceptual constancy for the phonetic segments being tested if they could successfully maintain the discrimination between the two types of segments in the face of the irrelevant changes introduced by adding new tokens varying in intonation pattern and speaker's voice. Kuhl found evidence that 6-month-old infants are able to ignore changes in intonation patterns and speakers' voices for both vowel (Kuhl, 1979; 1983) and fricative (Holmberg, Morgan, & Kuhl, 1977) segments.

Infant Speech Perception Capacities: Summary

Infant speech perception studies have revealed a number of things regarding the infant's perceptual capacities. First, categorical discrimination along certain phonetic continua is present for infants as well as adults. Second, the infant is able to successfully discriminate a wide variety of contrasts within the first 2 or 3 months of life. Third, little or no experience appears to be required for making phonetic distinctions because the infant is able to discriminate contrasts that are not present in the native language-learning environment. Fourth, the infant is able to process information about phonetic segments in noninitial positions of utterances. Fifth, the infant is sensitive to phonetic contrasts occurring in unstressed as well as stressed syllables. Sixth, the infant displays some capacity for perceptual constancy in that he or she is able to ignore differences in speakers' voices and intonation contours in making phonetic discriminations. Therefore, the infant possesses many of the perceptual capacities required for analyzing the acoustic stream of speech and recovering the phonetic structure of the native language that he or she will be trying to acquire. In particular, the young infant seems to be well-equipped to cope with the variability present in the speech signal as evidenced by the demonstrations of both categorical discrimination and perceptual constancy across changes in intonation and talker.

On the Question of Phonetic Capacities

The Case for Specialized Phonetic Processing Mechanisms in Infants

Given the complement of abilities that the young infant possesses for discriminating speech sounds, one is tempted to conclude that there is very little in the way of development of speech perception, save for the attenuated abilities with foreign contrasts. Hence, one could argue that the infant comes equipped with specialized phonetic capacities. As noted earlier, when Eimas, Siqueland, Jusczyk, & Vigorito (1971) conducted their study they concluded that the infant was endowed with mechanisms specialized for processing language. However, this claim was based on an assumption, now known to be false, that categorical perception occurs only with speech sounds. Since that time, there have been numerous demonstrations of categorical perception with nonspeech stimuli (Miller, Wier, Pastore, Kelly, & Dooling, 1976; Pastore, Ahroon, Buffuto, Friedman, Puleo, & Fink, 1977; Pisoni, 1977). Moreover, there have also been indications that categorical perception for speech dimensions can be found in nonhuman species such as the chinchilla (Kuhl & Miller, 1975; 1978). Hence, the mere demonstration that infants exhibit categorical discrimination for speech would not appear to provide sufficient grounds for claiming that specialized speech perception capacities exist in infants.
Nevertheless, there are other grounds on which one might base a case in favor of the existence of specialized speech processing mechanisms. A reasonable way to support a claim for specialized speech mechanisms would be to demonstrate that infants process speech sounds differently than they do nonspeech sounds. Some evidence in favor of such a speech-nonspeech processing difference was reported in studies conducted by Eimas (1974; 1975) and Till (1976). Eimas used nonspeech patterns called "chirps," which were truncated versions of the speech syllables that he employed. Specifically, the chirps consisted of only the second (Eimas, 1974) or third (Eimas, 1975) formant transition portion of the speech syllables. Since the only acoustic differences between the members of each speech syllable pair were in the second or third formant transitions, Eimas reasoned that the chirps served as appropriate nonspeech controls. In particular, he argued that the acoustic differences that infants had to discriminate were the same in the speech and nonspeech test pairs. The results of his investigations indicated that the infants processed the chirps and speech syllables differently. Discrimination performance for the speech contrasts tended to be categorical in that between-category contrasts (e.g., [ba] vs. [da]) were discriminated, but within-category contrasts (e.g., [ba1] vs. [ba2]) were not. By comparison, discrimination of the nonspeech contrasts was continuous, with no differences in performance evident for between-category and within-category contrasts. Till (1976) found similar results in his study, which employed a different set of nonspeech controls.

An Alternative Explanation: Infant Speech Perception Mediated by General Auditory Mechanisms

The results of these studies involving speech-nonspeech comparisons would appear to provide two grounds for contending that infants possess specialized speech processing mechanisms. First, categorical discrimination was obtained only with speech contrasts. Second, the same acoustic information was apparently processed differently in speech and nonspeech contexts. However, subsequent research has undercut both of these grounds. First, it has been demonstrated that categorical discrimination does occur with certain nonspeech contrasts (Jusczyk, Pisoni, Reed, Fernald, & Myers, 1983; Jusczyk, Pisoni, Walley, & Murray, 1980). Hence, for the infant, as well as the adult, categorical discrimination is not limited to speech. Second, the assumption that the same acoustic information is available in both nonspeech chirps and speech syllables has also been challenged (Jusczyk, Smith, & Murphy, 1981; Pisoni, 1976). In particular, Jusczyk, Smith, & Murphy (1981) have suggested that the omission of first formant transition information from chirp stimuli deprives the listener of a context against which to evaluate differences in second or third formant transitions. They found marked differences in the way in which adults processed nonspeech chirp stimuli with and without accompanying first formant information. Hence, it cannot be assumed that infants are processing the same acoustic differences in the syllable and chirp stimuli, especially if the perceptual analysis is conducted not on the individual formants, but on the relationships between the formants. Therefore, the more recent studies of nonspeech processing by infants (Jusczyk, Pisoni, Walley, & Murray, 1980; Jusczyk, Pisoni, Reed, Fernald, & Myers, 1983) would seem to favor a common explanation for speech and nonspeech processing by infants.
Similarly, the discovery that infants display perceptual constancy across different talkers is actually better handled by an explanation in terms of general auditory capacities rather than specialized speech capacities. Studies such as that of Peterson and Barney (1952) have shown that there is a great deal of acoustic variation in tokens produced by different speakers; at first glance, it would appear that the only commonality that exists in tokens of the same syllable uttered by different speakers is phonetic rather than acoustic. However, nonhuman mammalian species such as the dog (Baru, 1975) and the chinchilla (Burdick & Miller, 1975) are apparently capable of adjusting at least to variations in speaker's voice and intonation contour. Hence, the mechanisms that extract constancies of this sort appear to be generally available in the mammalian auditory system, suggesting a basis in some measure of the overall similarity of the acoustic patterns rather than an analysis into speech-related component dimensions.
Recently, another type of finding has been offered as evidence of specialized speech processing by infants. Eimas and Miller (1980a) found that the infant's discrimination of formant transition duration differences used to signal a contrast between [ba] and [wa] depended upon contextual information in the form of syllabic duration, even though the information for syllable duration came well after the transition information. The argument that the infants' behavior in this setting is indicative of specialized speech processing mechanisms rests on certain assumptions drawn from a study with adults by Miller and Liberman (1979). Specifically, the latter found that the context effects were not simply attributable to an overall increase in stimulus duration, because changes in overall duration that were not associated with changes in speaking rate (such as adding a new phonetic segment) did not produce the context effects. Instead, only a change in duration associated with a change in speaking rate (e.g., lengthening the vocalic portion of the syllable) yielded the context effects. Eimas and Miller (1980a; see also Miller & Eimas, 1983) employed similar logic in arguing for phonetic processing effects in their study with infants. Unfortunately, unlike Miller and Liberman (1979), they were not able to assess the conseque...

Table of contents

  1. Cover
  2. Title
  3. Copyright
  4. Preface
  5. Dedication
  7. Acknowledgements
  8. Contents
  9. Contributors
  10. 1. Toward a Model of the Development of Speech Perception
  11. 2. Discovering Sound Units and Constructing Sound Systems: It's Child's Play
  12. 3. Sources of Variability in Early Speech Development
  13. 4. On the Genetic Basis of Linguistic Variation
  14. 5. Auditory Models as Front Ends in Speech Recognition Systems
  15. 6. Speech Perception as Vector Analysis: An Approach to the Problem of Invariance and Segmentation
  16. 7. Variation and Interaction in Speech
  17. 8. Analysis of French Stop Consonants Using a Model of the Peripheral Auditory System
  18. 9. On Acoustic Invariance in Speech
  19. 10. Invariance and Variability in Speech Production: A Distinction Between Linguistic Intent and Its Neuromotor Implementation
  20. 11. Relative Invariance of Articulatory Movements: An Iceberg Model
  21. 12. Temporal Invariance in the Production of Speech
  22. 13. Invariance and Variability in Speech Timing: From Utterance to Segment in German
  23. 14. Problem of Variability in Speech Recognition and in Models of Speech Perception
  24. 15. Performing Fine Phonetic Distinctions: Templates versus Features
  25. 16. Normalization and Vowel Perception
  26. 17. Exploiting Lawful Variability in the Speech Wave
  27. 18. Phonological Evidence for Top-Down Processing in Speech Perception
  28. 19. Sources of Inherent Variation in the Speech Process
  29. 20. Toward a Phonetic and Phonological Theory of Redundant Features
  30. 21. Variability of Feature Specifications
  31. 22. Features—Fiction and Facts
  32. 23. On the Origin and Purpose of Discreteness and Invariance in Sound Patterns
  33. 24. Invariance and Variability of Words in the Speech Chain
  34. 25. Invariance in Phonetics
  35. References
  36. Index