Similar Documents
1.
Three experiments follow up on Easton and Basala's (1982) report that the "McGurk effect" (an influence of a visibly mouthed utterance on a dubbed acoustic one) does not occur when utterances are real words rather than nonsense syllables. In contrast, with real-word stimuli, Easton and Basala report a strong reverse effect whereby a dubbed soundtrack strongly affects identification of lipread words. In Experiment 1, we showed that a strong McGurk effect does obtain when dubbed real words are discrepant with observed words in consonantal place of articulation. A second experiment obtained only a weak reverse effect of dubbed words on judgments of lipread words. A final experiment was designed to provide a sensitive test of effects of lipread words on judgments of heard words and of heard words on judgments of lipread words. The findings reinforced those of the first two experiments that both effects occur, but, with place-of-articulation information discrepant across the modalities, the McGurk effect is strong and the reverse effect weak.

2.
The “McGurk effect” demonstrates that visual (lip-read) information is used during speech perception even when it is discrepant with auditory information. While this has been established as a robust effect in subjects from Western cultures, our own earlier results had suggested that Japanese subjects use visual information much less than American subjects do (Sekiyama & Tohkura, 1993). The present study examined whether Chinese subjects would also show a reduced McGurk effect due to their cultural similarities with the Japanese. The subjects were 14 native speakers of Chinese living in Japan. Stimuli consisted of 10 syllables (/ba/, /pa/, /ma/, /wa/, /da/, /ta/, /na/, /ga/, /ka/, /ra/) pronounced by two speakers, one Japanese and one American. Each auditory syllable was dubbed onto every visual syllable within one speaker, resulting in 100 audiovisual stimuli in each language. The subjects’ main task was to report what they thought they had heard while watching and listening to the speaker as each stimulus was presented. Compared with previous results obtained with American subjects, the Chinese subjects showed a weaker McGurk effect. The results also showed that the magnitude of the McGurk effect depends on the length of time the Chinese subjects had lived in Japan. Factors that foster and alter the Chinese subjects’ reliance on auditory information are discussed.
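The 100 stimuli per language come from fully crossing the ten auditory syllables with the ten visual syllables within a speaker. A minimal Python sketch of that combinatorial design (purely illustrative; the actual stimuli were dubbed videotapes, not strings):

```python
from itertools import product

# The ten syllables listed in the abstract.
syllables = ["ba", "pa", "ma", "wa", "da", "ta", "na", "ga", "ka", "ra"]

# Cross-dub every auditory syllable onto every visual syllable within
# one speaker: 10 x 10 = 100 audiovisual stimuli per language.
stimuli = list(product(syllables, syllables))  # (auditory, visual) pairs
assert len(stimuli) == 100

# The 10 diagonal pairs are congruent; the remaining 90 dubbings are
# the potentially McGurk-inducing (incongruent) combinations.
incongruent = [(a, v) for a, v in stimuli if a != v]
assert len(incongruent) == 90
```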

3.
Studies of the McGurk effect have shown that when discrepant phonetic information is delivered to the auditory and visual modalities, the information is combined into a new percept not originally presented to either modality. In typical experiments, the auditory and visual speech signals are generated by the same talker. The present experiment examined whether a discrepancy in the gender of the talker between the auditory and visual signals would influence the magnitude of the McGurk effect. A male talker's voice was dubbed onto a videotape containing a female talker's face, and vice versa. The gender-incongruent videotapes were compared with gender-congruent videotapes, in which a male talker's voice was dubbed onto a male face and a female talker's voice was dubbed onto a female face. Even though there was a clear incompatibility in talker characteristics between the auditory and visual signals on the incongruent videotapes, the resulting magnitude of the McGurk effect was not significantly different for the incongruent as opposed to the congruent videotapes. The results indicate that the mechanism for integrating speech information from the auditory and the visual modalities is not disrupted by a gender incompatibility even when it is perceptually apparent. The findings are compatible with the theoretical notion that information about voice characteristics of the talker is extracted and used to normalize the speech signal at an early stage of phonetic processing, prior to the integration of the auditory and the visual information.

4.
Reports of sex differences in language processing are inconsistent and are thought to vary by task type and difficulty. In two experiments, we investigated a sex difference in visual influence on heard speech (the McGurk effect). First, incongruent consonant-vowel stimuli were presented where the visual portion of the signal was brief (100 msec) or full (temporally equivalent to the auditory). Second, to determine whether men and women differed in their ability to extract visual speech information from these brief stimuli, the same stimuli were presented to new participants with an additional visual-only (lipread) condition. In both experiments, women showed a significantly greater visual influence on heard speech than did men for the brief visual stimuli. No sex differences for the full stimuli or in the ability to lipread were found. These findings indicate that the more challenging brief visual stimuli elicit sex differences in the processing of audiovisual speech.

5.
In the McGurk effect, visual information specifying a speaker’s articulatory movements can influence auditory judgments of speech. In the present study, we attempted to find an analogue of the McGurk effect by using nonspeech stimuli: the discrepant audiovisual tokens of plucks and bows on a cello. The results of an initial experiment revealed that subjects’ auditory judgments were influenced significantly by the visual pluck and bow stimuli. However, a second experiment in which speech syllables were used demonstrated that the visual influence on consonants was significantly greater than the visual influence observed for pluck-bow stimuli. This result could be interpreted to suggest that the nonspeech visual influence was not a true McGurk effect. In a third experiment, visual stimuli consisting of the words pluck and bow were found to have no influence over auditory pluck and bow judgments. This result could suggest that the nonspeech effects found in Experiment 1 were based on the audio and visual information’s having an ostensive lawful relation to the specified event. These results are discussed in terms of motor-theory, ecological, and FLMP approaches to speech perception.

6.
Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired (“unity assumption”). Participants made temporal order judgments (TOJ) and simultaneity judgments (SJ) about sine-wave speech (SWS) replicas of pseudowords and the corresponding video of the face. Listeners in speech and non-speech mode were equally sensitive when judging audiovisual temporal order. Yet, using the McGurk effect, we could demonstrate that the sound was more likely to be integrated with lipread speech when it was heard as speech than as non-speech. Judging temporal order in audiovisual speech is thus unaffected by whether the auditory and visual streams are paired. Conceivably, previously found differences between speech and non-speech stimuli are not due to the putative “special” nature of speech, but rather reflect low-level stimulus differences.
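Sine-wave speech (SWS) replaces an utterance’s formants with time-varying sinusoids, removing natural voice quality while preserving the phonetic trajectory, so the same signal can be heard either as speech or as non-speech whistles. A minimal synthesis sketch, assuming formant tracks have already been extracted (the function name and array layout are illustrative, not taken from the paper):

```python
import numpy as np

def sine_wave_speech(freqs, amps, fs=16000):
    """Synthesize a sine-wave speech replica: one sinusoid per formant,
    with instantaneous phase accumulated from the formant frequency
    track. `freqs` and `amps` are arrays of shape (n_formants,
    n_samples), e.g. F1-F3 tracks from an LPC or formant-tracker
    analysis of the original utterance."""
    phase = 2 * np.pi * np.cumsum(freqs, axis=1) / fs
    return np.sum(amps * np.sin(phase), axis=0)
```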

7.
Three experiments investigated the "McGurk effect" whereby optically specified syllables experienced synchronously with acoustically specified syllables integrate in perception to determine a listener's auditory perceptual experience. The experiments contrasted the cross-modal effect of orthographic on acoustic syllables, presumed to be associated in experience and memory, with that of haptically experienced and acoustic syllables, presumed not to be associated. The latter pairing gave rise to cross-modal influences when subjects were informed that cross-modal syllables were paired independently. Mouthed syllables affected reports of simultaneously heard syllables (and vice versa). These effects were absent when syllables were simultaneously seen (spelled) and heard. The McGurk effect therefore does not arise from association in memory, but from conjoint near specification of the same causal source in the environment: in speech, the moving vocal tract producing phonetic gestures.

8.
Visual information provided by a talker’s mouth movements can influence the perception of certain speech features. Thus, the “McGurk effect” shows that when the syllable /bi/ is presented auditorily in synchrony with a visually presented /gi/, a person perceives the talker as saying /di/. Moreover, studies have shown that interactions occur between place and voicing features in phonetic perception when information is presented audibly. In our first experiment, we asked whether feature interactions occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The results demonstrated that the voicing boundary for the auditory-visual tokens was located at a significantly longer VOT value than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified audibly. In three follow-up experiments, we show that (1) the voicing boundary is not shifted in the absence of a change in the global percept, even when discrepant auditory-visual information is presented; (2) the number of response alternatives provided for the subjects does not affect the categorization or the VOT boundary of the auditory-visual stimuli; and (3) the original effect of a VOT boundary shift is not replicated when subjects are forced by instruction to “relabel” the /b/-/p/ auditory stimuli as /d/ or /t/. The subjects successfully relabeled the stimuli, but no shift in the VOT boundary was observed.
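The “voicing boundary” in such studies is the VOT value at which voiced and voiceless responses are equally likely; a shift in that boundary is the measured effect. The abstract does not state how the boundary was located; one standard approach is to fit a logistic psychometric function to the identification proportions, sketched here with invented data:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    """Probability of a voiceless response (/p/ or /t/) as a function
    of voice onset time (VOT, in ms)."""
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# Invented identification data: VOT steps along the continuum and the
# proportion of voiceless responses at each step.
vot_steps = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
p_voiceless = np.array([0.02, 0.05, 0.15, 0.55, 0.90, 0.97, 0.99])

# The fitted `boundary` parameter is the 50% crossover point.
(boundary, slope), _ = curve_fit(logistic, vot_steps, p_voiceless, p0=[30.0, 0.2])
print(f"50% voicing boundary at {boundary:.1f} ms VOT")
```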

9.
Functional similarities in verbal memory performance across presentation modalities (written, heard, lipread) are often taken to point to a common underlying representational form upon which the modalities converge. We show here instead that the pattern of performance depends critically on presentation modality, and that different mechanisms give rise to superficially similar effects across modalities. Lipread recency is underpinned by different mechanisms from auditory recency, and while the effect of an auditory suffix on an auditory list is due to the perceptual grouping of the suffix with the list, the corresponding effect with lipread speech is due to misidentification of the lexical content of the lipread suffix. Further, while a lipread suffix does not disrupt auditory recency, an auditory suffix does disrupt recency for lipread lists. However, this effect is due to attentional capture ensuing from the presentation of an unexpected auditory event, and is evident with both verbal and nonverbal auditory suffixes. These findings add to a growing body of evidence that short-term verbal memory performance is determined by modality-specific perceptual and motor processes, rather than by the storage and manipulation of phonological representations.

10.
Two experiments investigated the effects of writing upon memory. In the first experiment an incidental learning procedure was employed: One group of subjects read words silently and wrote visually presented words, and a second group of subjects listened to auditorily presented words and wrote heard words. Recognition of heard words was substantially enhanced by writing, whereas the effect of writing on memory for read words was less powerful. A second experiment employing an intentional learning procedure replicated these findings and demonstrated the robustness of the beneficial consequences of writing on memory for heard words. These findings are conceptualized within a framework that proposes that translations between specialized processing domains that occur at encoding lead to the formation of distinctive memories and, hence, to better retention.

11.
Phoneme identification with audiovisually discrepant stimuli is influenced by information in the visual signal (the McGurk effect). Additionally, lexical status affects identification of auditorily presented phonemes. The present study tested for lexical influences on the McGurk effect. Participants identified phonemes in audiovisually discrepant stimuli in which the lexical status of the auditory component and of a visually influenced percept was independently varied. Visually influenced (McGurk) responses were more frequent when they formed a word and when the auditory signal was a nonword (Experiment 1). Lexical effects were larger for slow than for fast responses (Experiment 2), as with auditory speech, and were replicated with stimuli matched on physical properties (Experiment 3). These results are consistent with models in which lexical processing of speech is modality independent.

12.
In noisy situations, visual information plays a critical role in the success of speech communication: listeners are better able to understand speech when they can see the speaker. Visual influence on auditory speech perception is also observed in the McGurk effect, in which discrepant visual information alters listeners’ auditory perception of a spoken syllable. When hearing /ba/ while seeing a person saying /ga/, for example, listeners may report hearing /da/. Because these two phenomena have been assumed to arise from a common integration mechanism, the McGurk effect has often been used as a measure of audiovisual integration in speech perception. In this study, we test whether this assumed relationship exists within individual listeners. We measured participants’ susceptibility to the McGurk illusion as well as their ability to identify sentences in noise across a range of signal-to-noise ratios in audio-only and audiovisual modalities. Our results do not show a relationship between listeners’ McGurk susceptibility and their ability to use visual cues to understand spoken sentences in noise, suggesting that McGurk susceptibility may not be a valid measure of audiovisual integration in everyday speech processing.
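Testing sentence identification “across a range of signal-to-noise ratios” amounts to scaling a masking noise relative to the power of the speech before mixing. The paper does not describe its mixing procedure; this sketch shows one common recipe (the function and its interface are illustrative):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` (assumed the same length as `speech`) so that the
    ratio of speech power to noise power equals `snr_db`, then mix."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    # Solve p_speech / (scale**2 * p_noise) = 10**(snr_db / 10).
    scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise
```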

13.
R. H. Maki & L. G. Braine, Perception, 1985, 14(1), 67-80.
In an earlier study it was found that judgments of right-left orientations and locations were more difficult than judgments of up-down only when spatial words were used in the tasks. Experiments are reported in which pictures of many objects were presented to eliminate the possibility that subjects in previous studies had used strategies specific to single-stimulus tasks. In experiment 1, right-left orientations were judged more slowly than up-down orientations both when the spatial words were used and when arbitrary letters replaced the spatial words. In experiment 2, judgments of the right-left locations of pictures took longer than judgments of their up-down locations only when spatial words were used in the task; the right-left difficulty was eliminated when arbitrary letters replaced the words. The differential effect of words and letters in location judgments seems to be due to the different coding strategies adopted by subjects under the two conditions. It is concluded that a right-left difficulty does not depend on the use of spatial terms: word and letter conditions yield different results only when the task permits different judgments to be made under the two conditions.

14.
The McGurk effect, where an incongruent visual syllable influences identification of an auditory syllable, does not always occur, suggesting that perceivers sometimes fail to use relevant visual phonetic information. We tested whether another visual phonetic effect, which involves the influence of visual speaking rate on perceived voicing (Green & Miller, 1985), would occur in instances when the McGurk effect does not. In Experiment 1, we established this visual rate effect using auditory and visual stimuli matching in place of articulation, finding a shift in the voicing boundary along an auditory voice-onset-time continuum with fast versus slow visual speech tokens. In Experiment 2, we used auditory and visual stimuli differing in place of articulation and found a shift in the voicing boundary due to visual rate when the McGurk effect occurred and, more critically, when it did not. The latter finding indicates that phonetically relevant visual information is used in speech perception even when the McGurk effect does not occur, suggesting that the incidence of the McGurk effect underestimates the extent of audio-visual integration.

15.
The work reported here investigated whether the extent of the McGurk effect differs according to the vowel context, and whether it differs when cross-modal vowels are matched or mismatched in Japanese. Two audio-visual experiments were conducted to examine the process of audio-visual phonetic-feature extraction and integration. The first experiment was designed to compare the extent of the McGurk effect in Japanese in three different vowel contexts. The results indicated that the effect was largest in the /i/ context, moderate in the /a/ context, and almost nonexistent in the /u/ context. This suggests that the occurrence of the McGurk effect depends on the characteristics of vowels and the visual cues from their articulation. The second experiment measured the McGurk effect in Japanese with cross-modal matched and mismatched vowels, and showed that, except with the /u/ sound, the effect was larger when the vowels were matched than when they were mismatched. These results showed, again, that the extent of the McGurk effect depends on vowel context and that auditory information processing before phonetic judgment plays an important role in cross-modal feature integration.

16.
In the McGurk effect, perception of audiovisually discrepant syllables can depend on auditory, visual, or a combination of audiovisual information. Under some conditions, visual information can override auditory information to the extent that identification judgments of a visually influenced syllable can be as consistent as for an analogous audiovisually compatible syllable. This might indicate that visually influenced and analogous audiovisually compatible syllables are phonetically equivalent. Experiments were designed to test this issue using a compelling visually influenced syllable in an AXB matching paradigm. Subjects were asked to match an audio syllable /va/ either to an audiovisually consistent syllable (audio /va/-video /fa/) or an audiovisually discrepant syllable (audio /ba/-video /fa/). It was hypothesized that if the two audiovisual syllables were phonetically equivalent, then subjects should choose them equally often in the matching task. Results show, however, that subjects are more likely to match the audio /va/ to the audiovisually consistent /va/, suggesting differences in phonetic convincingness. Additional experiments further suggest that this preference is not based on a phonetically extraneous dimension or on noticeable relative audiovisual discrepancies.

17.
Speech perception is audiovisual, as demonstrated by the McGurk effect in which discrepant visual speech alters the auditory speech percept. We studied the role of visual attention in audiovisual speech perception by measuring the McGurk effect in two conditions. In the baseline condition, attention was focused on the talking face. In the distracted attention condition, subjects ignored the face and attended to a visual distractor, which was a leaf moving across the face. The McGurk effect was weaker in the latter condition, indicating that visual attention modulated audiovisual speech perception. This modulation may occur at an early, unisensory processing stage, or it may be due to changes at the stage where auditory and visual information is integrated. We investigated this issue by conventional statistical testing, and by fitting the Fuzzy Logical Model of Perception (Massaro, 1998) to the results. The two methods suggested different interpretations, revealing a paradox in the current methods of analysis.
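The FLMP referenced here models integration by combining continuous support values from each modality multiplicatively and normalizing across response alternatives. A minimal sketch with invented support values (illustrative numbers, not parameters fitted in the study):

```python
import numpy as np

def flmp(auditory_support, visual_support):
    """Fuzzy Logical Model of Perception (Massaro, 1998): support for
    each response alternative is the product of the auditory and visual
    support values, normalized across alternatives (the relative
    goodness rule)."""
    combined = np.asarray(auditory_support) * np.asarray(visual_support)
    return combined / combined.sum()

# Invented support values for /ba/, /da/, /ga/ given auditory /ba/
# dubbed onto visual /ga/: audition favors /ba/, vision favors /ga/,
# and the multiplicative combination favors the fused /da/ percept.
print(flmp([0.60, 0.30, 0.10], [0.05, 0.45, 0.50]))
# -> approximately [0.14, 0.63, 0.23]
```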

18.
These experiments explored the claim by A. Lotto and K. Kluender (1998) that frequency contrast explains listeners' compensations for coarticulation in the case of liquid consonants coarticulating with following stops. Evidence of frequency contrast was not found in experiments that tested for it directly, but Lotto and Kluender's finding that high- and low-frequency precursor tones can produce contrastive effects on stop-consonant judgments was replicated. The effect depends on the amplitude relation of the tones to the third formant (F3) of the stops. This implies that the tones mask F3 information in the stop consonants. It is unknown whether liquids and following stops in natural speech are in an appropriate intensity relation for masking of the stop. A final experiment, exploiting the McGurk effect, showed compensation for coarticulation by listeners when neither frequency contrast nor masking could be the source of the compensations.

