Similar Documents
20 similar documents retrieved (search time: 31 ms)
1.
Infants' representations of the sound patterns of words were explored by examining the effects of talker variability on the recognition of words in fluent speech. Infants were familiarized with isolated words (e.g., cup and dog) from 1 talker and then heard 4 passages produced by another talker, 2 of which included the familiarized words. At 7.5 months of age, infants attended longer to passages with the familiar words for materials produced by 2 female talkers or 2 male talkers but not for materials by a male and a female talker. These findings suggest a strong role for talker-voice similarity in infants' ability to generalize word tokens. By 10.5 months, infants could generalize different instances of the same word across talkers of the opposite sex. One implication of the present results is that infants' initial representations of the sound structure of words not only include phonetic information but also indexical properties relating to the vocal characteristics of particular talkers.

2.
Two talkers' productions of the same phoneme may be quite different acoustically, whereas their productions of different speech sounds may be virtually identical. Despite this lack of invariance in the relationship between the speech signal and linguistic categories, listeners experience phonetic constancy across a wide range of talkers, speaking styles, linguistic contexts, and acoustic environments. The authors present evidence that perceptual sensitivity to talker variability involves an active cognitive mechanism: Listeners expecting to hear 2 different talkers differing only slightly in average pitch showed performance costs typical of adjusting to talker variability, whereas listeners hearing the same materials but expecting a single talker or given no special instructions did not show these performance costs. The authors discuss the implications for understanding phonetic constancy despite variability between talkers (and other sources of variability) and for theories of speech perception. The results provide further evidence for active, controlled processing in real-time speech perception and are consistent with a model of talker normalization that involves contextual tuning.

3.
We conducted four experiments to investigate the specificity of perceptual adjustments made to unusual speech sounds. Dutch listeners heard a female talker produce an ambiguous fricative [?] (between [f] and [s]) in [f]- or [s]-biased lexical contexts. Listeners with [f]-biased exposure (e.g., [witlo?]; from witlof, "chicory"; witlos is meaningless) subsequently categorized more sounds on an [εf]–[εs] continuum as [f] than did listeners with [s]-biased exposure. This occurred when the continuum was based on the exposure talker's speech (Experiment 1), and when the same test fricatives appeared after vowels spoken by novel female and male talkers (Experiments 1 and 2). When the continuum was made entirely from a novel talker's speech, there was no exposure effect (Experiment 3) unless fricatives from that talker had been spliced into the exposure talker's speech during exposure (Experiment 4). We conclude that perceptual learning about idiosyncratic speech is applied at a segmental level and is, under these exposure conditions, talker specific.

4.
For nearly two decades it has been known that infants' perception of speech sounds is affected by native language input during the first year of life. However, definitive evidence of a mechanism to explain these developmental changes in speech perception has remained elusive. The present study provides the first evidence for such a mechanism, showing that the statistical distribution of phonetic variation in the speech signal influences whether 6- and 8-month-old infants discriminate a pair of speech sounds. We familiarized infants with speech sounds from a phonetic continuum, exhibiting either a bimodal or unimodal frequency distribution. During the test phase, only infants in the bimodal condition discriminated tokens from the endpoints of the continuum. These results demonstrate that infants are sensitive to the statistical distribution of speech sounds in the input language, and that this sensitivity influences speech perception.
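The distributional-learning logic described in this abstract can be sketched in code. The following is a hypothetical illustration, not the study's actual materials: the frequency vectors below merely follow the bimodal/unimodal shapes typical of this paradigm, and the toy learner posits two phonetic categories only when the exposure distribution over an 8-step continuum has two separated modes.

```python
# Toy distributional learner over an 8-step phonetic continuum.
# Exposure frequencies are hypothetical, shaped like the bimodal vs.
# unimodal familiarization distributions used in this paradigm.

def posits_two_categories(freqs):
    """Infer two categories iff the exposure distribution has two
    strict local maxima separated by a lower-frequency valley."""
    peaks = [i for i in range(1, len(freqs) - 1)
             if freqs[i] > freqs[i - 1] and freqs[i] > freqs[i + 1]]
    if len(peaks) < 2:
        return False
    left, right = peaks[0], peaks[-1]
    valley = min(freqs[left + 1:right])        # dip between the two modes
    return valley < freqs[left] and valley < freqs[right]

bimodal  = [2, 8, 4, 2, 2, 4, 8, 2]   # peaks near the continuum endpoints
unimodal = [2, 4, 6, 8, 8, 6, 4, 2]   # single central peak

print(posits_two_categories(bimodal))   # True: two categories -> endpoints discriminated
print(posits_two_categories(unimodal))  # False: one category -> no discrimination
```

On this sketch, the same endpoint tokens are "discriminable" only after bimodal exposure, mirroring the infants' test-phase behavior.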

5.
Visual information provided by a talker's mouth movements can influence the perception of certain speech features. Thus, the "McGurk effect" shows that when the syllable /bi/ is presented audibly, in synchrony with the syllable /gi/, as it is presented visually, a person perceives the talker as saying /di/. Moreover, studies have shown that interactions occur between place and voicing features in phonetic perception, when information is presented audibly. In our first experiment, we asked whether feature interactions occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The results demonstrated that the voicing boundary for the auditory-visual tokens was located at a significantly longer VOT value than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified audibly. (ABSTRACT TRUNCATED AT 250 WORDS)

6.

The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.


7.
In six experiments with English‐learning infants, we examined the effects of variability in voice and foreign accent on word recognition. We found that 9‐month‐old infants successfully recognized words when two native English talkers with dissimilar voices produced test and familiarization items (Experiment 1). When the domain of variability was shifted to include variability in voice as well as in accent, 13‐, but not 9‐month‐olds, recognized a word produced across talkers when only one had a Spanish accent (Experiments 2 and 3). Nine‐month‐olds accommodated some variability in accent by recognizing words when the same Spanish‐accented talker produced familiarization and test items (Experiment 4). However, 13‐, but not 9‐month‐olds, could do so when test and familiarization items were produced by two distinct Spanish‐accented talkers (Experiments 5 and 6). These findings suggest that, although monolingual 9‐month‐olds have abstract phonological representations, these representations may not be flexible enough to accommodate the modifications found in foreign‐accented speech.

8.
Barker, B. A., & Newman, R. S. (2004). Cognition, 94(2), B45–B53.
Little is known about the acoustic cues infants might use to selectively attend to one talker in the presence of background noise. This study examined the role of talker familiarity as a possible cue. Infants either heard their own mothers (maternal-voice condition) or a different infant's mother (novel-voice condition) repeating isolated words while a female distracter voice spoke fluently in the background. Subsequently, infants heard passages produced by the target voice containing either the familiarized, target words or novel words. Infants in the maternal-voice condition listened significantly longer to the passages containing familiar words; infants in the novel-voice condition showed no preference. These results suggest that infants are able to separate the simultaneous speech of two women when one of the voices is highly familiar to them. However, infants seem to find separating the simultaneous speech of two unfamiliar women extremely difficult.

9.
SPEECH PERCEPTION AS A TALKER-CONTINGENT PROCESS
Abstract— To determine how familiarity with a talker's voice affects perception of spoken words, we trained two groups of subjects to recognize a set of voices over a 9-day period. One group then identified novel words produced by the same set of talkers at four signal-to-noise ratios. Control subjects identified the same words produced by a different set of talkers. The results showed that the ability to identify a talker's voice improved intelligibility of novel words produced by that talker. The results suggest that speech perception may involve talker-contingent processes whereby perceptual learning of aspects of the vocal source facilitates the subsequent phonetic analysis of the acoustic signal.

10.
This study examined infants' abilities to separate speech from different talkers and to recognize a familiar word (the infant's own name) in the context of noise. In 4 experiments, infants heard repetitions of either their names or unfamiliar names in the presence of background babble. Five-month-old infants listened longer to their names when the target voice was 10 dB, but not 5 dB, more intense than the background. Nine-month-olds likewise failed to identify their names at a 5-dB signal-to-noise ratio, but 13-month-olds succeeded. Thus, by 5 months, infants possess some capacity to selectively attend to an interesting voice in the context of competing distractor voices. However, this ability is quite limited and develops further when infants near 1 year of age.

11.
Children’s early word production is influenced by the statistical frequency of speech sounds and combinations. Three experiments asked whether this production effect can be explained by a perceptual learning mechanism that is sensitive to word-token frequency and/or variability. Four-year-olds were exposed to nonwords that were either frequent (presented 10 times) or infrequent (presented once). When the frequent nonwords were spoken by the same talker, children showed no significant effect of perceptual frequency on production. When the frequent nonwords were spoken by different talkers, children produced them with fewer errors and shorter latencies. The results implicate token variability in perceptual learning.

12.
Learning nonnative speech contrasts in adulthood has proven difficult. Standard training methods have achieved moderate effects using explicit instructions and performance feedback. In this study, the authors question preexisting assumptions by demonstrating a superiority of implicit training procedures. They trained 3 groups of Greek adults on a difficult Hindi contrast (a) explicitly, with feedback (Experiment 1), or (b) implicitly, unaware of the phoneme distinctions, with (Experiment 2) or without (Experiment 3) feedback. Stimuli were natural recordings of consonant-vowel syllables with retroflex and dental unvoiced stops by a native Hindi speaker. On each trial, participants heard pairs of tokens from both categories and had to identify the retroflex sounds (explicit condition) or the sounds differing in intensity (implicit condition). Unbeknownst to participants, in the implicit conditions, target sounds were always retroflex, and distractor sounds were always dental. Post-training identification and discrimination tests showed improved performance of all groups, compared with a baseline of untrained Greek listeners. Learning was most robust for implicit training without feedback. It remains to be investigated whether implicitly trained skills can generalize to linguistically relevant phonetic categories when appropriate variability is introduced. These findings challenge traditional accounts of the role of feedback in phonetic training and highlight the importance of implicit, reward-based mechanisms.

13.
Visual information provided by a talker’s mouth movements can influence the perception of certain speech features. Thus, the “McGurk effect” shows that when the syllable /bi/ is presented audibly, in synchrony with the syllable /gi/, as it is presented visually, a person perceives the talker as saying /di/. Moreover, studies have shown that interactions occur between place and voicing features in phonetic perception, when information is presented audibly. In our first experiment, we asked whether feature interactions occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The results demonstrated that the voicing boundary for the auditory-visual tokens was located at a significantly longer VOT value than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified audibly. In three follow-up experiments, we show that (1) the voicing boundary is not shifted in the absence of a change in the global percept, even when discrepant auditory-visual information is presented; (2) the number of response alternatives provided for the subjects does not affect the categorization or the VOT boundary of the auditory-visual stimuli; and (3) the original effect of a VOT boundary shift is not replicated when subjects are forced by instruction to “relabel” the /b/–/p/ auditory stimuli as /d/ or /t/. The subjects successfully relabeled the stimuli, but no shift in the VOT boundary was observed.

14.
Research has shown that speaking rate provides an important context for the perception of certain acoustic properties of speech. For example, syllable duration, which varies as a function of speaking rate, has been shown to influence the perception of voice onset time (VOT) for syllable-initial stop consonants. The purpose of the present experiments was to examine the influence of syllable duration when the initial portion of the syllable was produced by one talker and the remainder of the syllable was produced by a different talker. A short-duration and a long-duration /bi/-/pi/ continuum were synthesized with pitch and formant values appropriate to a female talker. When presented to listeners for identification, these stimuli demonstrated the typical effect of syllable duration on the voicing boundary: a shorter VOT boundary for the short stimuli than for the long stimuli. An /i/ vowel, synthesized with pitch and formant values appropriate to a male talker, was added to the end of each of the short tokens, producing a new hybrid continuum. Although the overall syllable duration of the hybrid stimuli equaled the original long stimuli, they produced a VOT boundary similar to that for the short stimuli. In a second experiment, two new /i/ vowels were synthesized. One had a pitch appropriate to a female talker with formant values appropriate to a male talker; the other had a pitch appropriate to a male talker and formants appropriate to a female talker. These vowels were used to create two new hybrid continua. In a third experiment, new hybrid continua were created by using more extreme male formant values. The results of both experiments demonstrated that the hybrid tokens with a change in pitch acted like the short stimuli, whereas the tokens with a change in formants acted like the long stimuli. A fourth experiment demonstrated that listeners could hear a change in talker with both sets of hybrid tokens. These results indicate that continuity of pitch but not formant structure appears to be the critical factor in the calculation of speaking rate within a syllable.

15.
In their first year, infants begin to learn the speech sounds of their language. This process is typically modeled as an unsupervised clustering problem in which phonetically similar speech‐sound tokens are grouped into phonetic categories by infants using their domain‐general inference abilities. We argue here that maternal speech is too phonetically variable for this account to be plausible, and we provide phonetic evidence from Spanish showing that infant‐directed Spanish vowels are more readily clustered over word types than over vowel tokens. The results suggest that infants’ early adaptation to native‐language phonetics depends on their word‐form lexicon, implicating a much wider range of potential sources of influence on infants’ developmental trajectories in language learning.
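The token-versus-type clustering contrast this abstract describes can be illustrated with a toy simulation. All values below are assumptions for illustration (hypothetical F1 means and variances, not the paper's Spanish measurements): raw vowel tokens from two heavily overlapping categories cluster poorly, but averaging tokens within word types first shrinks the variance, so the same plain 2-means procedure recovers the categories far more accurately.

```python
import random
import statistics

random.seed(1)
CAT_MEANS = (400.0, 600.0)   # assumed F1 means (Hz) for two vowel categories
TOKEN_SD = 150.0             # assumed token-level variability (heavy overlap)

def two_means(values, iters=50):
    """Plain 1-D 2-means (Lloyd's algorithm); returns sorted cluster centers."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = statistics.mean(a), statistics.mean(b)
    return sorted((lo, hi))

def accuracy(values, labels):
    """Fraction of points assigned to the cluster matching their true category."""
    c0, c1 = two_means(values)
    pred = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
    return sum(p == t for p, t in zip(pred, labels)) / len(labels)

# 20 word types per category, 20 tokens per type.
tokens, tok_labels, types, type_labels = [], [], [], []
for label, mean in enumerate(CAT_MEANS):
    for _ in range(20):
        word_tokens = [random.gauss(mean, TOKEN_SD) for _ in range(20)]
        tokens += word_tokens
        tok_labels += [label] * len(word_tokens)
        types.append(statistics.mean(word_tokens))  # one point per word type
        type_labels.append(label)

# Averaging within word types cuts the noise, so type-level clustering
# recovers the categories more accurately than token-level clustering.
print(accuracy(tokens, tok_labels) < accuracy(types, type_labels))
```

Under these assumptions, token-level accuracy hovers well below type-level accuracy, matching the paper's argument that clustering over word types is more tractable than clustering over raw vowel tokens.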

16.
English, French, and bilingual English-French 17-month-old infants were compared for their performance on a word learning task using the Switch task. Object names presented a /b/ vs. /g/ contrast that is phonemic in both English and French, and auditory strings comprised English and French pronunciations by an adult bilingual. Infants were habituated to two novel objects labeled 'bowce' or 'gowce' and were then presented with a switch trial where a familiar word and familiar object were paired in a novel combination, and a same trial with a familiar word–object pairing. Bilingual infants looked significantly longer to switch vs. same trials, but English and French monolinguals did not, suggesting that bilingual infants can learn word–object associations when the phonetic conditions favor their input. Monolingual infants likely failed because the bilingual mode of presentation increased phonetic variability and did not match their real-world input. Experiment 2 tested this hypothesis by presenting monolingual infants with nonce word tokens restricted to native language pronunciations. Monolinguals succeeded in this case. Experiment 3 revealed that the presence of unfamiliar pronunciations in Experiment 2, rather than a reduction in overall phonetic variability, was the key factor to success, as French infants failed when tested with English pronunciations of the nonce words. Thus phonetic variability impacts how infants perform in the switch task in ways that contribute to differences in monolingual and bilingual performance. Moreover, both monolinguals and bilinguals are developing adaptive speech processing skills that are specific to the language(s) they are learning.

17.
A visual fixation study tested whether 7-month-olds can discriminate between different talkers. The infants were first habituated to talkers producing sentences in either a familiar or unfamiliar language, then heard test sentences from previously unheard speakers, either in the language used for habituation, or in another language. When the language at test mismatched that in habituation, infants always noticed the change. When language remained constant and only talker altered, however, infants detected the change only if the language was the native tongue. Adult listeners with a different native tongue from the infants did not reproduce the discriminability patterns shown by the infants, and infants detected neither voice nor language changes in reversed speech; both these results argue against explanation of the native-language voice discrimination in terms of acoustic properties of the stimuli. The ability to identify talkers is, like many other perceptual abilities, strongly influenced by early life experience.

18.
Infants are often spoken to in the presence of background sounds, including speech from other talkers. In the present study, we compared 5- and 8.5-month-olds’ abilities to recognize their own names in the context of three different types of background speech: that of a single talker, multitalker babble, and that of a single talker played backward. Infants recognized their names at a 10-dB signal-to-noise ratio in the multiple-voice condition but not in the single-voice (nonreversed) condition, a pattern opposite to that of typical adult performance. Infants similarly failed to recognize their names when the background talker’s voice was reversed—that is, unintelligible, but with speech-like acoustic properties. These data suggest that infants may have difficulty segregating the components of different speech streams when those streams are acoustically too similar. Alternatively, infants’ attention may be drawn to the time-varying acoustic properties associated with a single talker’s speech, causing difficulties when a single talker is the competing sound.

19.
In 5 experiments, the authors investigated how listeners learn to recognize unfamiliar talkers and how experience with specific utterances generalizes to novel instances. Listeners were trained over several days to identify 10 talkers from natural, sinewave, or reversed speech sentences. The sinewave signals preserved phonetic and some suprasegmental properties while eliminating natural vocal quality. In contrast, the reversed speech signals preserved vocal quality while distorting temporally based phonetic properties. The training results indicate that listeners learned to identify talkers even from acoustic signals lacking natural vocal quality. Generalization performance varied across the different signals and depended on the salience of phonetic information. The results suggest similarities in the phonetic attributes underlying talker recognition and phonetic perception.

20.
Infants' long-term memory for the phonological patterns of words versus the indexical properties of talkers' voices was examined in 3 experiments using the Headturn Preference Procedure (D. G. Kemler Nelson et al., 1995). Infants were familiarized with repetitions of 2 words and tested on the next day for their orientation times to 4 passages, 2 of which included the familiarized words. At 7.5 months of age, infants oriented longer to passages containing familiarized words when these were produced by the original talker. At 7.5 and 10.5 months of age, infants did not recognize words in passages produced by a novel female talker. In contrast, 7.5-month-olds demonstrated word recognition in both talker conditions when presented with passages produced by both the original and the novel talker. The findings suggest that talker-specific information can prime infants' memory for words and facilitate word recognition across talkers.

