Similar Literature
20 similar records found (search time: 484 ms)
1.
Visual information provided by a talker’s mouth movements can influence the perception of certain speech features. Thus, the “McGurk effect” shows that when the syllable /bi/ is presented audibly, in synchrony with the syllable /gi/, as it is presented visually, a person perceives the talker as saying /di/. Moreover, studies have shown that interactions occur between place and voicing features in phonetic perception, when information is presented audibly. In our first experiment, we asked whether feature interactions occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The results demonstrated that the voicing boundary for the auditory-visual tokens was located at a significantly longer VOT value than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified audibly. In three follow-up experiments, we show that (1) the voicing boundary is not shifted in the absence of a change in the global percept, even when discrepant auditory-visual information is presented; (2) the number of response alternatives provided for the subjects does not affect the categorization or the VOT boundary of the auditory-visual stimuli; and (3) the original effect of a VOT boundary shift is not replicated when subjects are forced by instruction to “relabel” the /b-p/ auditory stimuli as /d/ or /t/. The subjects successfully relabeled the stimuli, but no shift in the VOT boundary was observed.
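The boundary shift reported above is defined over identification functions along the VOT continuum. As a minimal illustrative sketch (not the authors' actual analysis), the 50% crossover of such a function can be estimated by fitting a logistic curve to the proportion of voiceless responses at each VOT step; all values below are invented.

```python
# Illustration only: locating a voicing (VOT) category boundary by fitting a
# logistic identification function. The data are invented; the technique is
# standard psychophysics, not the specific procedure used in the study above.
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    """Proportion of voiceless responses as a function of VOT (ms)."""
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

vot_ms = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
# Invented proportions; the audiovisual condition crosses over at a longer VOT.
p_voiceless_auditory = np.array([0.02, 0.05, 0.20, 0.65, 0.90, 0.97, 0.99])
p_voiceless_audiovisual = np.array([0.01, 0.03, 0.10, 0.40, 0.80, 0.95, 0.99])

for label, props in [("auditory-only", p_voiceless_auditory),
                     ("auditory-visual", p_voiceless_audiovisual)]:
    (boundary, slope), _ = curve_fit(logistic, vot_ms, props, p0=[30.0, 0.2])
    print(f"{label}: 50% voicing boundary at roughly {boundary:.1f} ms VOT")
```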

2.
Visual information provided by a talker's mouth movements can influence the perception of certain speech features. Thus, the "McGurk effect" shows that when the syllable /bi/ is presented audibly, in synchrony with the syllable /gi/, as it is presented visually, a person perceives the talker as saying /di/. Moreover, studies have shown that interactions occur between place and voicing features in phonetic perception, when information is presented audibly. In our first experiment, we asked whether feature interactions occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The results demonstrated that the voicing boundary for the auditory-visual tokens was located at a significantly longer VOT value than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified audibly. (ABSTRACT TRUNCATED AT 250 WORDS)

3.
Previous work has demonstrated that the graded internal structure of phonetic categories is sensitive to a variety of contextual factors. One such factor is place of articulation: The best exemplars of voiceless stop consonants along auditory bilabial and velar voice onset time (VOT) continua occur over different ranges of VOTs (Volaitis & Miller, 1992). In the present study, we exploited the McGurk effect to examine whether visual information for place of articulation also shifts the best exemplar range for voiceless consonants, following Green and Kuhl's (1989) demonstration of effects of visual place of articulation on the location of voicing boundaries. In Experiment 1, we established that /p/ and /t/ have different best exemplar ranges along auditory bilabial and alveolar VOT continua. We then found, in Experiment 2, a similar shift in the best-exemplar range for /t/ relative to that for /p/ when there was a change in visual place of articulation, with auditory place of articulation held constant. These findings indicate that the perceptual mechanisms that determine internal phonetic category structure are sensitive to visual, as well as to auditory, information.
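Where a boundary analysis locates a category edge, a best-exemplar analysis summarizes the graded interior of a category from goodness ratings. The sketch below shows one plausible way to derive such a range (steps rated within 90% of the peak); both the criterion and the ratings are assumptions for illustration, not the procedure reported in the study above.

```python
# Illustration only: summarizing a "best-exemplar range" from invented
# goodness ratings (1-7 scale) for /p/ along a bilabial VOT continuum.
import numpy as np

vot_ms = np.array([20, 40, 60, 80, 100, 120, 140])
mean_goodness = np.array([2.1, 4.5, 6.3, 6.8, 6.5, 5.0, 3.2])

# Assumed criterion: VOT steps whose mean rating is within 90% of the peak.
threshold = 0.9 * mean_goodness.max()
best_range = vot_ms[mean_goodness >= threshold]
print(f"best-exemplar range: {best_range.min()}-{best_range.max()} ms VOT")
```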

4.
Results of auditory speech experiments show that reaction times (RTs) for place classification in a test condition in which stimuli vary along the dimensions of both place and voicing are longer than RTs in a control condition in which stimuli vary only in place. Similar results are obtained when subjects are asked to classify the stimuli along the voicing dimension. By taking advantage of the "McGurk" effect (McGurk & MacDonald, 1976), the present study investigated whether a similar pattern of interference extends to situations in which variation along the place dimension occurs in the visual modality. The results showed that RTs for classifying phonetic features in the test condition were significantly longer than in the control condition for the place and voicing dimensions. These results indicate a mutual and symmetric interference exists in the classification of the two dimensions, even when the variation along the dimensions occurs in separate modalities.

5.
Some reaction time experiments are reported on the relation between the perception and production of phonetic features in speech. Subjects had to produce spoken consonant-vowel syllables rapidly in response to other consonant-vowel stimulus syllables. The stimulus syllables were presented auditorily in one condition and visually in another. Reaction time was measured as a function of the phonetic features shared by the consonants of the stimulus and response syllables. Responses to auditory stimulus syllables were faster when the response syllables started with consonants that had the same voicing feature as those of the stimulus syllables. A shared place-of-articulation feature did not affect the speed of responses to auditory stimulus syllables, even though the place feature was highly salient. For visual stimulus syllables, performance was independent of whether the consonants of the response syllables had the same voicing, same place of articulation, or no shared features. This pattern of results occurred in cases where the syllables contained stop consonants and where they contained fricatives. It held for natural auditory stimuli as well as artificially synthesized ones. The overall data reveal a close relation between the perception and production of voicing features in speech. It does not appear that such a relation exists between perceiving and producing places of articulation. The experiments are relevant to the motor theory of speech perception and to other models of perceptual-motor interactions.

6.
Three experiments are reported on the influence of different timing relations on the McGurk effect. In the first experiment, it is shown that strict temporal synchrony between auditory and visual speech stimuli is not required for the McGurk effect. Subjects were strongly influenced by the visual stimuli when the auditory stimuli lagged the visual stimuli by as much as 180 msec. In addition, a stronger McGurk effect was found when the visual and auditory vowels matched. In the second experiment, we paired auditory and visual speech stimuli produced under different speaking conditions (fast, normal, clear). The results showed that the manipulations in both the visual and auditory speaking conditions independently influenced perception. In addition, there was a small but reliable tendency for the better matched stimuli to elicit more McGurk responses than unmatched conditions. In the third experiment, we combined auditory and visual stimuli produced under different speaking conditions (fast, clear) and delayed the acoustics with respect to the visual stimuli. The subjects showed the same pattern of results as in the second experiment. Finally, the delay did not cause different patterns of results for the different audiovisual speaking style combinations. The results suggest that perceivers may be sensitive to the concordance of the time-varying aspects of speech but they do not require temporal coincidence of that information.

7.
Phoneme identification with audiovisually discrepant stimuli is influenced by information in the visual signal (the McGurk effect). Additionally, lexical status affects identification of auditorily presented phonemes. The present study tested for lexical influences on the McGurk effect. Participants identified phonemes in audiovisually discrepant stimuli in which lexical status of the auditory component and of a visually influenced percept was independently varied. Visually influenced (McGurk) responses were more frequent when they formed a word and when the auditory signal was a nonword (Experiment 1). Lexical effects were larger for slow than for fast responses (Experiment 2), as with auditory speech, and were replicated with stimuli matched on physical properties (Experiment 3). These results are consistent with models in which lexical processing of speech is modality independent.

8.
The “McGurk effect” demonstrates that visual (lip-read) information is used during speech perception even when it is discrepant with auditory information. While this has been established as a robust effect in subjects from Western cultures, our own earlier results had suggested that Japanese subjects use visual information much less than American subjects do (Sekiyama & Tohkura, 1993). The present study examined whether Chinese subjects would also show a reduced McGurk effect due to their cultural similarities with the Japanese. The subjects were 14 native speakers of Chinese living in Japan. Stimuli consisted of 10 syllables (/ba/, /pa/, /ma/, /wa/, /da/, /ta/, /na/, /ga/, /ka/, /ra/) pronounced by two speakers, one Japanese and one American. Each auditory syllable was dubbed onto every visual syllable within one speaker, resulting in 100 audiovisual stimuli in each language. The subjects’ main task was to report what they thought they had heard while looking at and listening to the speaker while the stimuli were being uttered. Compared with previous results obtained with American subjects, the Chinese subjects showed a weaker McGurk effect. The results also showed that the magnitude of the McGurk effect depends on the length of time the Chinese subjects had lived in Japan. Factors that foster and alter the Chinese subjects’ reliance on auditory information are discussed.
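For concreteness, the factorial dubbing design described above (every auditory syllable paired with every visual syllable, within each speaker) amounts to a simple cross product; the sketch below uses the syllable list from the abstract, while the speaker labels and pairing code are placeholders.

```python
# Illustration only: enumerating the audiovisual dubbing matrix.
from itertools import product

# Syllables as listed in the abstract; speaker labels are placeholders.
syllables = ["ba", "pa", "ma", "wa", "da", "ta", "na", "ga", "ka", "ra"]

def dubbing_matrix(speaker):
    """All (speaker, auditory, visual) pairings within one speaker."""
    return [(speaker, audio, visual) for audio, visual in product(syllables, repeat=2)]

stimuli = dubbing_matrix("JP_speaker") + dubbing_matrix("EN_speaker")
print(len(stimuli))  # 200 tokens: 10 x 10 = 100 audiovisual stimuli per speaker
```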

9.
Studies of the McGurk effect have shown that when discrepant phonetic information is delivered to the auditory and visual modalities, the information is combined into a new percept not originally presented to either modality. In typical experiments, the auditory and visual speech signals are generated by the same talker. The present experiment examined whether a discrepancy in the gender of the talker between the auditory and visual signals would influence the magnitude of the McGurk effect. A male talker's voice was dubbed onto a videotape containing a female talker's face, and vice versa. The gender-incongruent videotapes were compared with gender-congruent videotapes, in which a male talker's voice was dubbed onto a male face and a female talker's voice was dubbed onto a female face. Even though there was a clear incompatibility in talker characteristics between the auditory and visual signals on the incongruent videotapes, the resulting magnitude of the McGurk effect was not significantly different for the incongruent as opposed to the congruent videotapes. The results indicate that the mechanism for integrating speech information from the auditory and the visual modalities is not disrupted by a gender incompatibility even when it is perceptually apparent. The findings are compatible with the theoretical notion that information about voice characteristics of the talker is extracted and used to normalize the speech signal at an early stage of phonetic processing, prior to the integration of the auditory and the visual information.

10.
In noisy situations, visual information plays a critical role in the success of speech communication: listeners are better able to understand speech when they can see the speaker. Visual influence on auditory speech perception is also observed in the McGurk effect, in which discrepant visual information alters listeners’ auditory perception of a spoken syllable. When hearing /ba/ while seeing a person saying /ga/, for example, listeners may report hearing /da/. Because these two phenomena have been assumed to arise from a common integration mechanism, the McGurk effect has often been used as a measure of audiovisual integration in speech perception. In this study, we test whether this assumed relationship exists within individual listeners. We measured participants’ susceptibility to the McGurk illusion as well as their ability to identify sentences in noise across a range of signal-to-noise ratios in audio-only and audiovisual modalities. Our results do not show a relationship between listeners’ McGurk susceptibility and their ability to use visual cues to understand spoken sentences in noise, suggesting that McGurk susceptibility may not be a valid measure of audiovisual integration in everyday speech processing.
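The individual-differences question in this study reduces to correlating two per-listener measures: McGurk susceptibility (e.g., the rate of visually influenced fusion responses) and audiovisual benefit for sentences in noise (audiovisual minus audio-only accuracy, averaged over signal-to-noise ratios). A hedged sketch with invented numbers, not the study's data or analysis code:

```python
# Illustration only: relating McGurk susceptibility to audiovisual benefit.
import numpy as np
from scipy.stats import pearsonr

# Invented per-participant measures.
mcgurk_fusion_rate = np.array([0.10, 0.35, 0.50, 0.20, 0.80, 0.60, 0.45, 0.25])
# Audiovisual benefit: AV minus audio-only sentence accuracy, averaged over SNRs.
av_benefit = np.array([0.12, 0.18, 0.15, 0.20, 0.17, 0.14, 0.19, 0.16])

r, p = pearsonr(mcgurk_fusion_rate, av_benefit)
print(f"r = {r:.2f}, p = {p:.3f}")  # the abstract reports no reliable relationship
```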

11.
The McGurk effect is usually presented as an example of fast, automatic, multisensory integration. We report a series of experiments designed to directly assess these claims. We used a syllabic version of the speeded classification paradigm, whereby response latencies to the first (target) syllable of spoken word-like stimuli are slowed down when the second (irrelevant) syllable varies from trial to trial. This interference effect is interpreted as a failure of selective attention to filter out the irrelevant syllable. In Experiment 1 we reproduced the syllabic interference effect with bimodal stimuli containing auditory as well as visual lip movement information, thus confirming the generalizability of the phenomenon. In subsequent experiments we were able to produce (Experiment 2) and to eliminate (Experiment 3) syllabic interference by introducing 'illusory' (McGurk) audiovisual stimuli in the irrelevant syllable, suggesting that audiovisual integration occurs prior to attentional selection in this paradigm.

12.
Studies of the McGurk effect have shown that when discrepant phonetic information is delivered to the auditory and visual modalities, the information is combined into a new percept not originally presented to either modality. In typical experiments, the auditory and visual speech signals are generated by the same talker. The present experiment examined whether a discrepancy in the gender of the talker between the auditory and visual signals would influence the magnitude of the McGurk effect. A male talker’s voice was dubbed onto a videotape containing a female talker’s face, and vice versa. The gender-incongruent videotapes were compared with gender-congruent videotapes, in which a male talker’s voice was dubbed onto a male face and a female talker’s voice was dubbed onto a female face. Even though there was a clear incompatibility in talker characteristics between the auditory and visual signals on the incongruent videotapes, the resulting magnitude of the McGurk effect was not significantly different for the incongruent as opposed to the congruent videotapes. The results indicate that the mechanism for integrating speech information from the auditory and the visual modalities is not disrupted by a gender incompatibility even when it is perceptually apparent. The findings are compatible with the theoretical notion that information about voice characteristics of the talker is extracted and used to normalize the speech signal at an early stage of phonetic processing, prior to the integration of the auditory and the visual information.

13.
The work reported here investigated whether the extent of McGurk effect differs according to the vowel context, and differs when cross-modal vowels are matched or mismatched in Japanese. Two audio-visual experiments were conducted to examine the process of audio-visual phonetic-feature extraction and integration. The first experiment was designed to compare the extent of the McGurk effect in Japanese in three different vowel contexts. The results indicated that the effect was largest in the /i/ context, moderate in the /a/ context, and almost nonexistent in the /u/ context. This suggests that the occurrence of McGurk effect depends on the characteristics of vowels and the visual cues from their articulation. The second experiment measured the McGurk effect in Japanese with cross-modal matched and mismatched vowels, and showed that, except with the /u/ sound, the effect was larger when the vowels were matched than when they were mismatched. These results showed, again, that the extent of McGurk effect depends on vowel context and that auditory information processing before phonetic judgment plays an important role in cross-modal feature integration.

14.
In two studies we investigated the way in which the components of speaking rate, articulation rate and pause rate, combine to influence processing of the silence-duration cue for the voicing distinction in medial stop consonants. First, we replicated the finding that the articulation rate of a carrier sentence, that is, the rate at which the speech itself is produced, influences how the duration information is used to assign voicing values. Second, and more importantly, the assignment of voicing values was also influenced by the pause rate of the sentence. Thus, the listener adjusts for both articulation rate and pause rate when processing the phonetically relevant information. Finally, the two rate components did not function in an equivalent manner, since changes in articulation rate had considerably more effect on phonetic judgments than did changes in pause rate. Alternative explanations for the relative weighting of the two variables are discussed.

15.
The importance of visual cues in speech perception is illustrated by the McGurk effect, whereby a speaker’s facial movements affect speech perception. The goal of the present study was to evaluate whether the McGurk effect is also observed for sung syllables. Participants heard and saw sung instances of the syllables /ba/ and /ga/ and then judged the syllable they perceived. Audio-visual stimuli were congruent or incongruent (e.g., auditory /ba/ presented with visual /ga/). The stimuli were presented as spoken, sung in an ascending and descending triad (C E G G E C), and sung in an ascending and descending triad that returned to a semitone above the tonic (C E G G E C#). Results revealed no differences in the proportion of fusion responses between spoken and sung conditions confirming that cross-modal phonemic information is integrated similarly in speech and song.

16.
In the McGurk effect, visual information specifying a speaker’s articulatory movements can influence auditory judgments of speech. In the present study, we attempted to find an analogue of the McGurk effect by using nonspeech stimuli—the discrepant audiovisual tokens of plucks and bows on a cello. The results of an initial experiment revealed that subjects’ auditory judgments were influenced significantly by the visual pluck and bow stimuli. However, a second experiment in which speech syllables were used demonstrated that the visual influence on consonants was significantly greater than the visual influence observed for pluck-bow stimuli. This result could be interpreted to suggest that the nonspeech visual influence was not a true McGurk effect. In a third experiment, visual stimuli consisting of the words pluck and bow were found to have no influence over auditory pluck and bow judgments. This result could suggest that the nonspeech effects found in Experiment 1 were based on the audio and visual information’s having an ostensive lawful relation to the specified event. These results are discussed in terms of motor-theory, ecological, and FLMP approaches to speech perception.
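The FLMP mentioned at the end of this abstract (Massaro's Fuzzy Logical Model of Perception) has a standard two-alternative integration rule in which the auditory and visual degrees of support are combined multiplicatively and then normalized; it is given here for reference, with a_i and v_j denoting the auditory and visual support for one response alternative (e.g., /da/).

```latex
% Two-alternative FLMP integration rule; a_i and v_j are the degrees of
% auditory and visual support for the alternative, each between 0 and 1.
P(\mathrm{/da/} \mid A_i, V_j) = \frac{a_i\, v_j}{a_i\, v_j + (1 - a_i)(1 - v_j)}
```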

17.
Sætrevik, B. (2010). The influence of visual information on auditory lateralization. Scandinavian Journal of Psychology. The classic McGurk study showed that presentation of one syllable in the visual modality simultaneous with a different syllable in the auditory modality creates the perception of a third, not presented syllable. The current study presented dichotic syllable pairs (one in each ear) simultaneously with video clips of a mouth pronouncing the syllables from one of the ears, or pronouncing a syllable that was not part of the dichotic pair. When asked to report the auditory stimuli, responses were shifted towards selecting the auditory stimulus from the side that matched the visual stimulus.

18.
Fowler, Brown, and Mann (2000) have reported a visually moderated phonetic context effect in which a video disambiguates an acoustically ambiguous precursor syllable, which, in turn, influences perception of a subsequent syllable. In the present experiments, we explored this finding and the claims that stem from it. Experiment 1 failed to replicate Fowler et al. with novel materials modeled after the original study, but Experiment 2 successfully replicated the effect, using Fowler et al.'s stimulus materials. This discrepancy was investigated in Experiments 3 and 4, which demonstrate that variation in visual information concurrent with the test syllable is sufficient to account for the original results. Fowler et al.'s visually moderated phonetic context effect appears to have been a demonstration of audiovisual interaction between concurrent stimuli, and not an effect whereby preceding visual information elicits changes in the perception of subsequent speech sounds.

19.
Three experiments follow up on Easton and Basala's (1982) report that the "McGurk effect" (an influence of a visibly mouthed utterance on a dubbed acoustic one) does not occur when utterances are real words rather than nonsense syllables. In contrast, with real-word stimuli, Easton and Basala report a strong reverse effect whereby a dubbed soundtrack strongly affects identification of lipread words. In Experiment 1, we showed that a strong McGurk effect does obtain when dubbed real words are discrepant with observed words in consonantal place of articulation. A second experiment obtained only a weak reverse effect of dubbed words on judgments of lipread words. A final experiment was designed to provide a sensitive test of effects of lipread words on judgments of heard words and of heard words on judgments of lipread words. The findings reinforced those of the first two experiments that both effects occur, but, with place-of-articulation information discrepant across the modalities, the McGurk effect is strong and the reverse effect weak.

20.
Three experiments follow up on Easton and Basala’s (1982) report that the “McGurk effect” (an influence of a visibly mouthed utterance on a dubbed acoustic one) does not occur when utterances are real words rather than nonsense syllables. In contrast, with real-word stimuli, Easton and Basala report a strong reverse effect whereby a dubbed soundtrack strongly affects identification of lipread words. In Experiment 1, we showed that a strong McGurk effect does obtain when dubbed real words are discrepant with observed words in consonantal place of articulation. A second experiment obtained only a weak reverse effect of dubbed words on judgments of lipread words. A final experiment was designed to provide a sensitive test of effects of lipread words on judgments of heard words and of heard words on judgments of lipread words. The findings reinforced those of the first two experiments that both effects occur, but, with place-of-articulation information discrepant across the modalities, the McGurk effect is strong and the reverse effect weak.

