期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Use of visual information in speech perception: evidence for a visual rate effect both with and without a McGurk effect

Brancazio L Miller JL 《Perception & psychophysics》2005,67(5):759-769

The McGurk effect, where an incongruent visual syllable influences identification of an auditory syllable, does not always occur, suggesting that perceivers sometimes fail to use relevant visual phonetic information. We tested whether another visual phonetic effect, which involves the influence of visual speaking rate on perceived voicing (Green & Miller, 1985), would occur in instances when the McGurk effect does not. In Experiment 1, we established this visual rate effect using auditory and visual stimuli matching in place of articulation, finding a shift in the voicing boundary along an auditory voice-onset-time continuum with fast versus slow visual speech tokens. In Experiment 2, we used auditory and visual stimuli differing in place of articulation and found a shift in the voicing boundary due to visual rate when the McGurk effect occurred and, more critically, when it did not. The latter finding indicates that phonetically relevant visual information is used in speech perception even when the McGurk effect does not occur, suggesting that the incidence of the McGurk effect underestimates the extent of audio-visual integration. 相似文献

2.

Audiovisual sentence recognition not predicted by susceptibility to the McGurk effect

Kristin J. Van Engen Zilong Xie Bharath Chandrasekaran 《Attention, perception & psychophysics》2017,79(2):396-403

In noisy situations, visual information plays a critical role in the success of speech communication: listeners are better able to understand speech when they can see the speaker. Visual influence on auditory speech perception is also observed in the McGurk effect, in which discrepant visual information alters listeners’ auditory perception of a spoken syllable. When hearing /ba/ while seeing a person saying /ga/, for example, listeners may report hearing /da/. Because these two phenomena have been assumed to arise from a common integration mechanism, the McGurk effect has often been used as a measure of audiovisual integration in speech perception. In this study, we test whether this assumed relationship exists within individual listeners. We measured participants’ susceptibility to the McGurk illusion as well as their ability to identify sentences in noise across a range of signal-to-noise ratios in audio-only and audiovisual modalities. Our results do not show a relationship between listeners’ McGurk susceptibility and their ability to use visual cues to understand spoken sentences in noise, suggesting that McGurk susceptibility may not be a valid measure of audiovisual integration in everyday speech processing. 相似文献

3.

McGurk效应的影响因素与神经基础

罗霄骁康冠兰周晓林《心理科学进展》2018,26(11):1935-1951

McGurk效应(麦格克效应)是典型的视听整合现象, 该效应受到刺激的物理特征、注意分配、个体视听信息依赖程度、视听整合能力、语言文化差异的影响。引发McGurk效应的关键视觉信息主要来自说话者的嘴部区域。产生McGurk效应的认知过程包含早期的视听整合(与颞上皮层有关)以及晚期的视听不一致冲突(与额下皮层有关)。未来研究应关注面孔社会信息对McGurk效应的影响, McGurk效应中单通道信息加工与视听整合的关系, 结合计算模型探讨其认知神经机制等。相似文献

4.

Discrimination tests of visually influenced syllables

Lawrence D. Rosenblum Helena M. Saldaña 《Attention, perception & psychophysics》1992,52(4):461-473

In the McGurk effect, perception of audiovisually discrepant syllables can depend on auditory, visual, or a combination of audiovisual information. Undersome conditions, Vi8Ual information can override auditory information to the extent that identification judgments of a-visually influenced syllable can be as consistent as for an analogous audiovisually compatible syllable. This might indicate that visually influenced and analogous audiuvisually-compatible syllables-are-phictnetically equivalent. Experiments were designed to test this issue using a compelling visually influenced syllable in an AXB matching paradigm. Subjects were asked tomatch an audio syllable /val either to an audiovisually consistent syllable (audio /val-video /fa/) or an audiovisually discrepant syllable (audio /bs/-video ifa!). It was hypothesized that if the two audiovisual syllables were phonetically equivalent, then subjects should choose them equally often in the matching task. Results show, however, that subjects are more likely to match the audio /va/ to the audiovisually consistent /va/, suggesting differences in phonetic convincingness. Additional experiments further suggest that this preference is not based on a phonetically extraneous dimension or on noticeable relative audiovisual discrepancies. 相似文献

5.

Gaze behavior in audiovisual speech perception: the influence of ocular fixations on the McGurk effect

Paré M Richler RC ten Hove M Munhall KG 《Perception & psychophysics》2003,65(4):553-567

We conducted three experiments in order to examine the influence of gaze behavior and fixation on audiovisual speech perception in a task that required subjects to report the speech sound they perceived during the presentation of congruent and incongruent (McGurk) audiovisual stimuli. Experiment 1 showed that the subjects' natural gaze behavior rarely involved gaze fixations beyond the oral and ocular regions of the talker's face and that these gaze fixations did not predict the likelihood of perceiving the McGurk effect. Experiments 2 and 3 showed that manipulation of the subjects' gaze fixations within the talker's face did not influence audiovisual speech perception substantially and that it was not until the gaze was displaced beyond 10 degrees - 20 degrees from the talker's mouth that the McGurk effect was significantly lessened. Nevertheless, the effect persisted under such eccentric viewing conditions and became negligible only when the subject's gaze was directed 60 degrees eccentrically. These findings demonstrate that the analysis of high spatial frequency information afforded by direct oral foveation is not necessary for the successful processing of visual speech information. 相似文献

6.

Psychophysics of the McGurk and other audiovisual speech integration effects

Jiang J Bernstein LE 《Journal of experimental psychology. Human perception and performance》2011,37(4):1193-1209

When the auditory and visual components of spoken audiovisual nonsense syllables are mismatched, perceivers produce four different types of perceptual responses, auditory correct, visual correct, fusion (the so-called McGurk effect), and combination (i.e., two consonants are reported). Here, quantitative measures were developed to account for the distribution of the four types of perceptual responses to 384 different stimuli from four talkers. The measures included mutual information, correlations, and acoustic measures, all representing audiovisual stimulus relationships. In Experiment 1, open-set perceptual responses were obtained for acoustic /bɑ/ or /lɑ/ dubbed to video /bɑ, dɑ, gɑ, vɑ, zɑ, lɑ, wɑ, eɑ/. The talker, the video syllable, and the acoustic syllable significantly influenced the type of response. In Experiment 2, the best predictors of response category proportions were a subset of the physical stimulus measures, with the variance accounted for in the perceptual response category proportions between 17% and 52%. That audiovisual stimulus relationships can account for perceptual response distributions supports the possibility that internal representations are based on modality-specific stimulus relationships. 相似文献

7.

Processing of audiovisually congruent and incongruent speech in school‐age children with a history of specific language impairment: a behavioral and event‐related potentials study

Natalya Kaganovich Jennifer Schumaker Danielle Macias Dana Gustafson 《Developmental science》2015,18(5):751-770

Previous studies indicate that at least some aspects of audiovisual speech perception are impaired in children with specific language impairment (SLI). However, whether audiovisual processing difficulties are also present in older children with a history of this disorder is unknown. By combining electrophysiological and behavioral measures, we examined perception of both audiovisually congruent and audiovisually incongruent speech in school‐age children with a history of SLI (H‐SLI), their typically developing (TD) peers, and adults. In the first experiment, all participants watched videos of a talker articulating syllables ‘ba’, ‘da’, and ‘ga’ under three conditions – audiovisual (AV), auditory only (A), and visual only (V). The amplitude of the N1 (but not of the P2) event‐related component elicited in the AV condition was significantly reduced compared to the N1 amplitude measured from the sum of the A and V conditions in all groups of participants. Because N1 attenuation to AV speech is thought to index the degree to which facial movements predict the onset of the auditory signal, our findings suggest that this aspect of audiovisual speech perception is mature by mid‐childhood and is normal in the H‐SLI children. In the second experiment, participants watched videos of audivisually incongruent syllables created to elicit the so‐called McGurk illusion (with an auditory ‘pa’ dubbed onto a visual articulation of ‘ka’, and the expectant perception being that of ‘ta’ if audiovisual integration took place). As a group, H‐SLI children were significantly more likely than either TD children or adults to hear the McGurk syllable as ‘pa’ (in agreement with its auditory component) than as ‘ka’ (in agreement with its visual component), suggesting that susceptibility to the McGurk illusion is reduced in at least some children with a history of SLI. Taken together, the results of the two experiments argue against global audiovisual integration impairment in children with a history of SLI and suggest that, when present, audiovisual integration difficulties in this population likely stem from a later (non‐sensory) stage of processing. 相似文献

8.

Cultural and linguistic factors in audiovisual speech processing: The McGurk effect in Chinese subjects

Kaoru Sekiyama 《Attention, perception & psychophysics》1997,59(1):73-80

The “McGurk effect” demonstrates that visual (lip-read) information is used during speech perception even when it is discrepant with auditory information. While this has been established as a robust effect in subjects from Western cultures, our own earlier results had suggested that Japanese subjects use visual information much less than American subjects do (Sekiyama & Tohkura, 1993). The present study examined whether Chinese subjects would also show a reduced McGurk effect due to their cultural similarities with the Japanese. The subjects were 14 native speakers of Chinese living in Japan. Stimuli consisted of 10 syllables (/ba/, /pa/, /ma/, /wa/, /da/, /ta/, /na/, /ga/, /ka/, /ra/ ) pronounced by two speakers, one Japanese and one American. Each auditory syllable was dubbed onto every visual syllable within one speaker, resulting in 100 audiovisual stimuli in each language. The subjects’ main task was to report what they thought they had heard while looking at and listening to the speaker while the stimuli were being uttered. Compared with previous results obtained with American subjects, the Chinese subjects showed a weaker McGurk effect. The results also showed that the magnitude of the McGurk effect depends on the length of time the Chinese subjects had lived in Japan. Factors that foster and alter the Chinese subjects’ reliance on auditory information are discussed. 相似文献

9.

Temporal constraints on the McGurk effect

K. G. Munhall P. Gribble L. Sacco M. Ward 《Attention, perception & psychophysics》1996,58(3):351-362

Three experiments are reported on the influence of different timing relations on the McGurk effect. In the first experiment, it is shown that strict temporal synchrony between auditory and visual speech stimuli is not required for the McGurk effect. Subjects were strongly influenced by the visual stimuli when the auditory stimuli lagged the visual stimuli by as much as 180 msec. In addition, a stronger McGurk effect was found when the visual and auditory vowels matched. In the second experiment, we paired auditory and visual speech stimuli produced under different speaking conditions (fast, normal, clear). The results showed that the manipulations in both the visual and auditory speaking conditions independently influenced perception. In addition, there was a small but reliable tendency for the better matched stimuli to elicit more McGurk responses than unmatched conditions. In the third experiment, we combined auditory and visual stimuli produced under different speaking conditions (fast, clear) and delayed the acoustics with respect to the visual stimuli. The subjects showed the same pattern of results as in the second experiment. Finally, the delay did not cause different patterns of results for the different audiovisual speaking style combinations. The results suggest that perceivers may be sensitive to the concordance of the time-varying aspects of speech but they do not require temporal coincidence of that information. 相似文献

10.

A critical evaluation of visually moderated phonetic context effects

Holt LL Stephens JD Lotto AJ 《Perception & psychophysics》2005,67(6):1102-1112

Fowler, Brown, and Mann (2000) have reported a visually moderated phonetic context effect in which a video disambiguates an acoustically ambiguous precursor syllable, which, in turn, influences perception of a subsequent syllable. In the present experiments, we explored this finding and the claims that stem from it. Experiment 1 failed to replicate Fowler et al. with novel materials modeled after the original study, but Experiment 2 successfully replicated the effect, using Fowler et al.'s stimulus materials. This discrepancy was investigated in Experiments 3 and 4, which demonstrate that variation in visual information concurrent with the test syllable is sufficient to account for the original results. Fowler et al.'s visually moderated phonetic context effect appears to have been a demonstration of audiovisual interaction between concurrent stimuli, and not an effect whereby preceding visual information elicits changes in the perception of subsequent speech sounds. 相似文献

11.

Discrimination tests of visually influenced syllables.

L D Rosenblum H M Salda?a 《Perception & psychophysics》1992,52(4):461-473

In the McGurk effect, perception of audiovisually discrepant syllables can depend on auditory, visual, or a combination of audiovisual information. Under some conditions, visual information can override auditory information to the extent that identification judgments of a visually influenced syllable can be as consistent as for an analogous audiovisually compatible syllable. This might indicate that visually influenced and analogous audiovisually compatible syllables are phonetically equivalent. Experiments were designed to test this issue using a compelling visually influenced syllable in an AXB matching paradigm. Subjects were asked to match an audio syllable/va/either to an audiovisually consistent syllable (audio/va/-video/fa/) or an audiovisually discrepant syllable (audio/ba/-video/fa/). It was hypothesized that if the two audiovisual syllables were phonetically equivalent, then subjects should choose them equally often in the matching task. Results show, however, that subjects are more likely to match the audio/va/ to the audiovisually consistent/va/, suggesting differences in phonetic convincingness. Additional experiments further suggest that this preference is not based on a phonetically extraneous dimension or on noticeable relative audiovisual discrepancies. 相似文献

12.

Masked priming effects with syllabic neighbors in a lexical decision task

Carreiras M Perea M 《Journal of experimental psychology. Human perception and performance》2002,28(5):1228-1242

Four lexical decision experiments using a masked priming paradigm were conducted to analyze whether the previous presentation of a syllabic neighbor (a word sharing the same 1st syllable) influences recognition performance. The results showed an inhibitory effect of more frequent syllabic primes and some facilitation of nonword syllabic primes (Experiments 1-3). When monosyllabic pairs were used (Experiment 3), no priming effects of the 2 initial letters were found. Finally, when using only syllables as primes, latencies to words were shorter when preceded by primes that corresponded to the 1st syllable than by primes that contained 1 letter more or less than the 1st syllable (Experiment 4). Results are interpreted using activation models that take into account a syllabic level of representation. 相似文献

13.

McGurk stimuli for the investigation of multisensory integration in cochlear implant users: The Oldenburg Audio Visual Speech Stimuli (OLAVS)

Maren Stropahl Sebastian Schellhardt Stefan Debener 《Psychonomic bulletin & review》2017,24(3):863-872

The concurrent presentation of different auditory and visual syllables may result in the perception of a third syllable, reflecting an illusory fusion of visual and auditory information. This well-known McGurk effect is frequently used for the study of audio-visual integration. Recently, it was shown that the McGurk effect is strongly stimulus-dependent, which complicates comparisons across perceivers and inferences across studies. To overcome this limitation, we developed the freely available Oldenburg audio-visual speech stimuli (OLAVS), consisting of 8 different talkers and 12 different syllable combinations. The quality of the OLAVS set was evaluated with 24 normal-hearing subjects. All 96 stimuli were characterized based on their stimulus disparity, which was obtained from a probabilistic model (cf. Magnotti & Beauchamp, 2015). Moreover, the McGurk effect was studied in eight adult cochlear implant (CI) users. By applying the individual, stimulus-independent parameters of the probabilistic model, the predicted effect of stronger audio-visual integration in CI users could be confirmed, demonstrating the validity of the new stimulus material. 相似文献

14.

Can we see syllables in monosyllabic words? A study with illusory conjunctions

Nadège Doignon-Camus Daniel Zagar Stéphanie Mathey 《Journal of Cognitive Psychology》2013,25(4):599-614

Mathey, Zagar, Doignon, and Seigneuric (2006) reported an inhibitory effect of syllabic neighbourhood in monosyllabic French words suggesting that syllable units mediate the access to lexical representations of monosyllabic stimuli. Two experiments were conducted to investigate the perception of syllable units in monosyllabic stimuli. The illusory conjunction paradigm was used to examine perceptual groupings of letters. Experiment 1 showed that potential syllables in monosyllabic French words (e.g., BI in BICHE) affected the pattern of illusory conjunctions. Experiment 2 indicated that the perceptual parsing in monosyllabic items was due to syllable information and orthographic redundancy. The implications of the data are discussed for visual word recognition processes in an interactive activation model incorporating syllable units and connected adjacent letters (IAS; Mathey et al., 2006). 相似文献

15.

Crossmodal duration perception involves perceptual grouping, temporal ventriloquism, and variable internal clock rates

Klink PC Montijn JS van Wezel RJ 《Attention, perception & psychophysics》2011,73(1):219-236

Here, we investigate how audiovisual context affects perceived event duration with experiments in which observers reported which of two stimuli they perceived as longer. Target events were visual and/or auditory and could be accompanied by nontargets in the other modality. Our results demonstrate that the temporal information conveyed by irrelevant sounds is automatically used when the brain estimates visual durations but that irrelevant visual information does not affect perceived auditory duration (Experiment 1). We further show that auditory influences on subjective visual durations occur only when the temporal characteristics of the stimuli promote perceptual grouping (Experiments 1 and 2). Placed in the context of scalar expectancy theory of time perception, our third and fourth experiments have the implication that audiovisual context can lead both to changes in the rate of an internal clock and to temporal ventriloquism-like effects on perceived on- and offsets. Finally, intramodal grouping of auditory stimuli diminished any crossmodal effects, suggesting a strong preference for intramodal over crossmodal perceptual grouping (Experiment 5). 相似文献

16.

Visual influences on auditory pluck and bow judgments

Helena M. Saldaña Lawrence D. Rosenblum 《Attention, perception & psychophysics》1993,54(3):406-416

In the McGurk effect, visual information specifying a speaker’s articulatory movements can influence auditory judgments of speech. In the present study, we attempted to find an analogue of the McGurk effect by using nonspeech stimuli—the discrepant audiovisual tokens of plucks and bows on a cello. The results of an initial experiment revealed that subjects’ auditory judgments were influenced significantly by the visual pluck and bow stimuli. However, a second experiment in which speech syllables were used demonstrated that the visual influence on consonants was significantly greater than the visual influence observed for pluck-bow stimuli. This result could be interpreted to suggest that the nonspeech visual influence was not a true McGurk effect. In a third experiment, visual stimuli consisting of the wordspluck andbow were found to have no influence over auditory pluck and bow judgments. This result could suggest that the nonspeech effects found in Experiment 1 were based on the audio and visual information’s having an ostensive lawful relation to the specified event. These results are discussed in terms of motor-theory, ecological, and FLMP approaches to speech perception. 相似文献

17.

Negative priming depends on ease of selection

Eric Ruthruff Jeff Miller 《Attention, perception & psychophysics》1995,57(5):715-723

Negative priming effects have been offered as evidence that distractor stimuli are identified. We conducted two experiments to determine if such effects occur even when it is easy to discriminate target from distractor stimuli. In Experiment 1, we found the usual negative priming effect when target and distractor positions varied from trial to trial, but not when these positions remained fixed. Experiment 2 extended these results to a situation where the ease of selection varied only in the prime display. These findings argue that irrelevant inputs can be filtered out prior to stimulus identification under certain circumstances and therefore pose problems for strict late selection theories. 相似文献

18.

Processing segmental and prosodic information in Cantonese word production

Wong AW Chen HC 《Journal of experimental psychology. Learning, memory, and cognition》2008,34(5):1172-1190

Five experiments were conducted to investigate how subsyllabic, syllabic, and prosodic information is processed in Cantonese monosyllabic word production. A picture-word interference task was used in which a target picture and a distractor word were presented simultaneously or sequentially. In the first 3 experiments with visually presented distractors, null effects on naming latencies were found when the distractor and the picture name shared the onset, the rhyme, the tone, or both the onset and tone. However, significant facilitation effects were obtained when the target and the distractor shared the rhyme + tone (Experiment 2), the segmental syllable (Experiment 3), or the syllable + tone (Experiment 3). Similar results were found in Experiments 4 and 5 with spoken rather than visual distractors. Moreover, a significant facilitation effect was observed in the rhyme-related condition in Experiment 5, and this effect was not affected by the degree of phonological overlap between the target and the distractor. These results are interpreted in an interactive model, which allows feedback sending from the subsyllabic to the lexical level during the phonological encoding stage in Cantonese word production. 相似文献

19.

The McGurk effect in infants

Lawrence D. Rosenblum Mark A. Schmuckler Jennifer A. Johnson 《Attention, perception & psychophysics》1997,59(3):347-357

In the McGurk effect, perceptual identification of auditory speech syllables is influenced by simultaneous presentation of discrepant visible speech syllables. This effect has been found in subjects of different ages and with various native language backgrounds. But no McGurk tests have been conducted with prelinguistic infants. In the present series of experiments, 5-month-old English-exposed infants were tested for the McGurk effect. Infants were first gaze-habituated to an audiovisual /va/. Two different dishabituation stimuli were then presented: audio /ba/-visual /va/ (perceived by adults as /va/), and audio /da/-visual /va/ (perceived by adults as /da/). The infants showed generalization from the audiovisual /va/ to the audio /ba/-visual /va/ stimulus but not to the audio /da/-visual /va/ stimulus. Follow-up experiments revealed that these generalization differences were not due to a general preference for the audio /da/-visual /va/ stimulus or to the auditory similarity of /ba/ to /va/ relative to /da/. These results suggest that the infants were visually influenced in the same way as Englishspeaking adults are visually influenced. 相似文献

20.

内源性空间线索有效性对视听觉整合的影响

唐晓雨吴英楠彭姓王爱君李奇《心理学报》2020,52(7):835-846

采用内源性线索-靶子范式, 操纵线索类型(有效线索、无效线索)和靶刺激通道类型(视觉刺激、听觉刺激、视听觉刺激)两个自变量, 通过两个实验, 分别设置50%和80%两种内源性空间线索有效性来考察不同空间线索有效性条件下内源性空间注意对视听觉整合的影响。结果发现, 当线索有效性为50%时(实验1), 有效线索位置和无效线索位置的视听觉整合效应没有显著差异; 当线索有效性为80%时(实验2), 有效线索位置的视听觉整合效应显著大于无效线索位置的视听觉整合效应。结果表明, 线索有效性不同时, 内源性空间注意对视听觉整合产生了不同的影响, 高线索有效性条件下内源性空间注意能够促进视听觉整合效应。相似文献