Similar Documents
20 similar documents found.
1.
We investigated whether the “unity assumption,” according to which an observer assumes that two different sensory signals refer to the same underlying multisensory event, influences the multisensory integration of audiovisual speech stimuli. Syllables (Experiments 1, 3, and 4) or words (Experiment 2) were presented to participants at a range of different stimulus onset asynchronies using the method of constant stimuli. Participants made unspeeded temporal order judgments regarding which stream (either auditory or visual) had been presented first. The auditory and visual speech stimuli in Experiments 1–3 were either gender matched (i.e., a female face presented together with a female voice) or else gender mismatched (i.e., a female face presented together with a male voice). In Experiment 4, different utterances from the same female speaker were used to generate the matched and mismatched speech video clips. Measured in terms of the just noticeable difference, the participants in all four experiments found it easier to judge which sensory modality had been presented first when evaluating mismatched stimuli than when evaluating the matched-speech stimuli. These results therefore provide the first empirical support for the “unity assumption” in the domain of the multisensory temporal integration of audiovisual speech stimuli.
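The JND used here is conventionally read off a psychometric function fitted to the temporal order judgment data. The following is a minimal sketch with hypothetical data (not values from the study): a cumulative Gaussian is fitted to the proportion of "visual first" responses across SOAs; its midpoint gives the point of subjective simultaneity (PSS) and its slope determines the JND.

```python
# Minimal sketch (hypothetical data, not values from the study): estimating
# the point of subjective simultaneity (PSS) and the just noticeable
# difference (JND) from temporal order judgment data collected with the
# method of constant stimuli.
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import norm

soas = np.array([-300, -200, -100, 0, 100, 200, 300])  # ms; negative = audio first
p_visual_first = np.array([0.05, 0.12, 0.30, 0.52, 0.74, 0.90, 0.96])

def cum_gauss(soa, pss, sigma):
    """Cumulative Gaussian psychometric function."""
    return norm.cdf(soa, loc=pss, scale=sigma)

(pss, sigma), _ = curve_fit(cum_gauss, soas, p_visual_first, p0=(0.0, 100.0))

# For a cumulative Gaussian, the JND (half the 25%-75% interval) is
# sigma * z(0.75), roughly 0.674 * sigma.
jnd = sigma * norm.ppf(0.75)
print(f"PSS = {pss:.1f} ms, JND = {jnd:.1f} ms")
```

On this logic, the larger JNDs observed for gender-matched clips correspond to a shallower psychometric function, i.e., poorer temporal order discrimination when the "unity assumption" binds the two streams.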

2.
Audiovisual integration (AVI) has been demonstrated to play a major role in speech comprehension. Previous research suggests that AVI in speech comprehension tolerates a temporal window of audiovisual asynchrony. However, few studies have employed audiovisual presentation to investigate AVI in person recognition. Here, participants completed an audiovisual voice familiarity task in which the synchrony of the auditory and visual stimuli was manipulated, and in which the visual speaker's identity either corresponded or did not correspond to the voice. Recognition of personally familiar voices systematically improved when corresponding visual speakers were presented near synchrony or with slight auditory lag. Moreover, when faces of different familiarity were presented with a voice, recognition accuracy suffered only at near synchrony to slight auditory lag. These results provide the first evidence for a temporal window for AVI in person recognition between approximately 100 ms auditory lead and 300 ms auditory lag.

3.
Audio-visual simultaneity judgments
The relative spatiotemporal correspondence between sensory events affects multisensory integration across a variety of species; integration is maximal when stimuli in different sensory modalities are presented from approximately the same position at about the same time. In the present study, we investigated the influence of spatial and temporal factors on audio-visual simultaneity perception in humans. Participants made unspeeded simultaneous versus successive discrimination responses to pairs of auditory and visual stimuli presented at varying stimulus onset asynchronies from either the same or different spatial positions using either the method of constant stimuli (Experiments 1 and 2) or psychophysical staircases (Experiment 3). The participants in all three experiments were more likely to report the stimuli as being simultaneous when they originated from the same spatial position than when they came from different positions, demonstrating that the apparent perception of multisensory simultaneity is dependent on the relative spatial position from which stimuli are presented.
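The two psychophysical procedures named above differ in how SOAs are sampled: the method of constant stimuli presents a fixed, pre-chosen set of SOAs, whereas a staircase adapts the SOA trial by trial. Below is a minimal sketch of an adaptive staircase with a simulated observer; all parameter values are hypothetical and not taken from the study.

```python
# Illustrative sketch (simulated observer, hypothetical parameters): a
# one-up/one-down staircase that adjusts the audiovisual SOA toward the
# point where "simultaneous" and "successive" reports are equally likely.
import math
import random

def toy_observer(soa_ms, boundary=150.0, slope=40.0):
    """Toy observer: the probability of reporting 'simultaneous' declines
    as the SOA moves past a (hypothetical) boundary."""
    p_simultaneous = 1.0 / (1.0 + math.exp((abs(soa_ms) - boundary) / slope))
    return random.random() < p_simultaneous

soa, step = 400.0, 50.0           # start well outside the boundary
direction, reversals = None, []
while len(reversals) < 8:
    # Widen the SOA after a "simultaneous" report, narrow it after a
    # "successive" report; the staircase oscillates around the 50% point.
    new_direction = 1 if toy_observer(soa) else -1
    if direction is not None and new_direction != direction:
        reversals.append(soa)
        step = max(step / 2.0, 5.0)  # finer steps after each reversal
    direction = new_direction
    soa = max(soa + new_direction * step, 0.0)

print(f"Estimated 50% simultaneity boundary: {sum(reversals[-4:]) / 4:.0f} ms")
```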

4.
When participants judge multimodal audiovisual stimuli, the auditory information strongly dominates temporal judgments, whereas the visual information dominates spatial judgments. However, temporal judgments are not independent of spatial features. For example, in the kappa effect, the time interval between two marker stimuli appears longer when they originate from spatially distant sources rather than from the same source. We investigated the kappa effect for auditory markers presented with accompanying irrelevant visual stimuli. The spatial sources of the markers were varied such that they were either congruent or incongruent across modalities. In two experiments, we demonstrated that the spatial layout of the visual stimuli affected perceived auditory interval duration. This effect occurred although the visual stimuli were designated to be task-irrelevant for the duration reproduction task in Experiment 1, and even when the visual stimuli did not contain sufficient temporal information to perform a two-interval comparison task in Experiment 2. We conclude that the visual and auditory marker stimuli were integrated into a combined multisensory percept containing temporal as well as task-irrelevant spatial aspects of the stimulation. Through this multisensory integration process, visuospatial information affected even temporal judgments, which are typically dominated by the auditory modality.

5.
Vatakis, A. and Spence, C. (in press) [Crossmodal binding: Evaluating the 'unity assumption' using audiovisual speech stimuli. Perception & Psychophysics] recently demonstrated that when two briefly presented speech signals (one auditory and the other visual) refer to the same audiovisual speech event, people find it harder to judge their temporal order than when they refer to different speech events. Vatakis and Spence argued that the 'unity assumption' facilitated crossmodal binding on the former (matching) trials by means of a process of temporal ventriloquism. In the present study, we investigated whether the 'unity assumption' would also affect the binding of non-speech stimuli (video clips of object action or musical notes). The auditory and visual stimuli were presented at a range of stimulus onset asynchronies (SOAs) using the method of constant stimuli. Participants made unspeeded temporal order judgments (TOJs) regarding which modality stream had been presented first. The auditory and visual musical and object action stimuli were either matched (e.g., the sight of a note being played on a piano together with the corresponding sound) or else mismatched (e.g., the sight of a note being played on a piano together with the sound of a guitar string being plucked). However, in contrast to the results of Vatakis and Spence's recent speech study, no significant difference in the accuracy of temporal discrimination performance for the matched versus mismatched video clips was observed. Reasons for this discrepancy are discussed.

6.
Kang Guanlan, Luo Xiaoxiao. Psychological Science (心理科学), 2020, (5): 1072-1078
Crossmodal information interaction refers to the set of processes by which information from one sensory modality interacts with, and influences the processing of, information from another sensory modality. It comprises two main questions: how inputs from different sensory modalities are integrated, and how conflicts between crossmodal signals are controlled. This article reviews the behavioral and neural mechanisms of audiovisual crossmodal integration and conflict control, and discusses how attention influences both. Future research should investigate the brain-network mechanisms of audiovisual crossmodal processing, and should examine crossmodal integration and conflict control in special populations to help reveal the mechanisms underlying their cognitive and social impairments.

7.
Wang Runzhou, Bi Hongyan. Advances in Psychological Science (心理科学进展), 2022, 30(12): 2764-2776
The nature of developmental dyslexia has long been a focus of debate among researchers. Many studies have found that individuals with dyslexia show deficits in audiovisual temporal integration. These studies, however, have examined only the overall, average-level performance of audiovisual temporal integration, leaving the dynamics of the integration process unexplored. Audiovisual temporal recalibration reflects the dynamic processing underlying audiovisual temporal integration: difficulty recalibrating discrepancies between internal temporal representations and sensory input impairs multisensory integration, and individuals with dyslexia show deficits in recalibration-related abilities. Impaired audiovisual temporal recalibration may therefore be the root cause of the audiovisual temporal integration deficit in developmental dyslexia. Future research should examine the specific manifestations of audiovisual temporal recalibration in individuals with developmental dyslexia, as well as the cognitive and neural mechanisms underlying them.

8.
This study examined the influence of the advance preparation effect on emotional audiovisual integration, separately on the temporal and the emotional-cognitive dimensions. In a temporal discrimination task (Experiment 1), visual cueing was significantly slower than auditory cueing, and the integration effect size was negative. In an emotion discrimination task (Experiment 2), the integration effect size was positive; for the integration of negative emotions, the effect was significantly larger under auditory cueing than under visual cueing, whereas for positive emotions it was significantly larger under visual cueing. The results indicate that emotional audiovisual integration is based on emotional-cognitive processing, whereas temporal discrimination suppresses integration; in addition, both the crossmodal advance preparation effect and the emotional advance preparation effect are related to the cueing modality.

9.
Previous studies of multisensory integration have often stressed the beneficial effects that may arise when information concerning an event arrives via different sensory modalities at the same time, as exemplified by research on the redundant target effect (RTE). By contrast, studies of the Colavita visual dominance effect (e.g., [Colavita, F. B. (1974). Human sensory dominance. Perception & Psychophysics, 16, 409–412]) highlight the inhibitory consequences of the competition between signals presented simultaneously in different sensory modalities. Although both the RTE and the Colavita effect are thought to occur at early sensory levels, and the stimulus conditions under which they are typically observed are very similar, the interplay between these two opposing behavioural phenomena (facilitation vs. competition) has yet to be addressed empirically. We hypothesized that the dissociation may reflect two of the fundamentally different ways in which humans can perceive concurrent auditory and visual stimuli. In Experiment 1, we demonstrated both multisensory facilitation (RTE) and the Colavita visual dominance effect using exactly the same audiovisual displays, by simply changing the task from a speeded detection task to a speeded modality discrimination task. Meanwhile, in Experiment 2, the participants exhibited multisensory facilitation when responding to visual targets and multisensory inhibition when responding to auditory targets while keeping the task constant. These results therefore indicate that both multisensory facilitation and inhibition can be demonstrated in response to the same bimodal event.
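A standard way to test whether redundant-target facilitation exceeds what two independent unisensory processes racing in parallel could produce is the race-model inequality, P(RT ≤ t | AV) ≤ P(RT ≤ t | A) + P(RT ≤ t | V). The sketch below, using entirely hypothetical reaction times, illustrates the general technique; it is not an analysis from this study.

```python
# Minimal sketch (hypothetical reaction times): testing redundant-target
# facilitation against the race-model inequality,
#   P(RT <= t | AV)  <=  P(RT <= t | A) + P(RT <= t | V).
# Violations of the bound imply coactivation beyond what two independent
# parallel racers can produce.
import numpy as np

rng = np.random.default_rng(0)
rt_a = rng.normal(320, 40, 200)    # auditory-only RTs (ms), hypothetical
rt_v = rng.normal(350, 45, 200)    # visual-only RTs
rt_av = rng.normal(280, 35, 200)   # redundant audiovisual RTs

def ecdf(sample, t):
    """Empirical CDF of `sample` evaluated at each time point in `t`."""
    return np.mean(sample[:, None] <= t, axis=0)

t = np.arange(150, 501, 10)
bound = np.minimum(ecdf(rt_a, t) + ecdf(rt_v, t), 1.0)
violations = t[ecdf(rt_av, t) > bound]
print("Race model violated at t =", violations, "ms")
```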

10.
Vatakis A, Spence C. Perception, 2008, 37(1): 143-160
Research has shown that inversion is more detrimental to the perception of faces than to the perception of other types of visual stimuli. Inverting a face results in an impairment of configural information processing that leads to slowed early face processing and reduced accuracy when performance is tested in face recognition tasks. We investigated the effects of inverting speech and non-speech stimuli on audiovisual temporal perception. Upright and inverted audiovisual video clips of a person uttering syllables (experiments 1 and 2), playing musical notes on a piano (experiment 3), or a rhesus monkey producing vocalisations (experiment 4) were presented. Participants made unspeeded temporal-order judgments regarding which modality stream (auditory or visual) appeared to have been presented first. Inverting the visual stream did not have any effect on the sensitivity of temporal discrimination responses in any of the four experiments, thus implying that audiovisual temporal integration is resilient to the effects of orientation in the picture plane. By contrast, the point of subjective simultaneity differed significantly as a function of orientation only for the audiovisual speech stimuli, but not for the non-speech stimuli or monkey calls. That is, smaller auditory leads were required for the inverted than for the upright-visual speech stimuli. These results are consistent with the longer processing latencies reported previously when human faces are inverted, and demonstrate that the temporal perception of dynamic audiovisual speech can be modulated by changes in the physical properties of the visual speech (i.e., by changes in orientation).

11.
McCotter MV, Jordan TR. Perception, 2003, 32(8): 921-936
We conducted four experiments to investigate the role of colour and luminance information in visual and audiovisual speech perception. In experiments 1a (stimuli presented in quiet conditions) and 1b (stimuli presented in auditory noise), face display types comprised naturalistic colour (NC), grey-scale (GS), and luminance inverted (LI) faces. In experiments 2a (quiet) and 2b (noise), face display types comprised NC, colour inverted (CI), LI, and colour and luminance inverted (CLI) faces. Six syllables and twenty-two words were used to produce auditory and visual speech stimuli. Auditory and visual signals were combined to produce congruent and incongruent audiovisual speech stimuli. Experiments 1a and 1b showed that perception of visual speech, and its influence on identifying the auditory components of congruent and incongruent audiovisual speech, was less for LI than for either NC or GS faces, which produced identical results. Experiments 2a and 2b showed that perception of visual speech, and influences on perception of incongruent auditory speech, was less for LI and CLI faces than for NC and CI faces (which produced identical patterns of performance). Our findings for NC and CI faces suggest that colour is not critical for perception of visual and audiovisual speech. The effect of luminance inversion on performance accuracy was relatively small (5%), which suggests that the luminance information preserved in LI faces is important for the processing of visual and audiovisual speech.

12.
We report a 53-year-old patient (AWF) who has an acquired deficit of audiovisual speech integration, characterized by a perceived temporal mismatch between speech sounds and the sight of moving lips. AWF was less accurate on an auditory digit span task with vision of a speaker's face as compared to a condition in which no visual information from the lower face was available. He was slower in matching words to pictures when he saw congruent lip movements compared to no lip movements or non-speech lip movements. Unlike normal controls, he showed no McGurk effect. We propose that multisensory binding of audiovisual language cues can be selectively disrupted.

13.
We constantly integrate the information that is available to our various senses. The extent to which the mechanisms of multisensory integration are subject to the influences of attention, emotion, and/or motivation is currently unknown. The “ventriloquist effect” is widely assumed to be an automatic crossmodal phenomenon, shifting the perceived location of an auditory stimulus toward a concurrently presented visual stimulus. In the present study, we examined whether audiovisual binding, as indicated by the magnitude of the ventriloquist effect, is influenced by threatening auditory stimuli presented prior to the ventriloquist experiment. Syllables spoken in a fearful voice were presented from one of eight loudspeakers, while syllables spoken in a neutral voice were presented from the other seven locations. Subsequently, participants had to localize pure tones while trying to ignore concurrent visual stimuli (both the auditory and the visual stimuli here were emotionally neutral). A reliable ventriloquist effect was observed. The emotional stimulus manipulation resulted in a reduction of the magnitude of the subsequently measured ventriloquist effect in both hemifields, as compared to a control group exposed to a similar attention-capturing, but nonemotional, manipulation. These results suggest that the emotional system is capable of influencing multisensory binding processes that have heretofore been considered automatic.

14.
We propose a measure of audiovisual speech integration that takes into account both accuracy and response times. This measure should prove beneficial for researchers investigating multisensory speech recognition, since it applies to normal-hearing and aging populations alike. As an example, age-related sensory decline influences both the rate at which one processes information and the ability to utilize cues from different sensory modalities. Our function assesses integration when both auditory and visual information are available by comparing performance on these audiovisual trials with theoretical predictions for performance under the assumptions of parallel, independent, self-terminating processing of single-modality inputs. We provide example data from an audiovisual identification experiment and discuss applications for measuring audiovisual integration skills across the life span.
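The authors' full measure is defined in the paper itself; as an illustration of the accuracy side of such a benchmark, parallel, independent, self-terminating processing predicts audiovisual accuracy by probability summation: the audiovisual trial succeeds if at least one single-modality process succeeds. A minimal sketch with hypothetical accuracies:

```python
# Illustrative sketch (hypothetical accuracies, not the authors' actual
# measure): under parallel, independent, self-terminating processing, the
# predicted audiovisual accuracy is the probability that at least one
# single-modality process succeeds (probability summation).
p_auditory = 0.70   # hypothetical unimodal identification accuracy
p_visual = 0.40

p_av_predicted = p_auditory + p_visual - p_auditory * p_visual  # 0.82

p_av_observed = 0.90  # hypothetical audiovisual accuracy
gain = p_av_observed - p_av_predicted
print(f"Predicted {p_av_predicted:.2f}, observed {p_av_observed:.2f}, "
      f"gain {gain:+.2f}")  # positive gain: more than independence predicts
```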

15.
Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired (“unity assumption”). Participants made temporal order judgments (TOJ) and simultaneity judgments (SJ) about sine-wave speech (SWS) replicas of pseudowords and the corresponding video of the face. Listeners in speech and non-speech mode were equally sensitive when judging audiovisual temporal order. Yet, using the McGurk effect, we could demonstrate that the sound was more likely to be integrated with lipread speech if it was heard as speech than as non-speech. Judging temporal order in audiovisual speech is thus unaffected by whether the auditory and visual streams are paired. Conceivably, previously found differences between speech and non-speech stimuli are not due to the putative “special” nature of speech, but rather reflect low-level stimulus differences.

16.
We live in a world rich in sensory information, and consequently the brain is challenged with deciphering which cues from the various sensory modalities belong together. Determinations regarding the relatedness of sensory information appear to be based, at least in part, on the spatial and temporal relationships between the stimuli. Stimuli that are presented in close spatial and temporal correspondence are more likely to be associated with one another and thus 'bound' into a single perceptual entity. While there is a robust literature delineating behavioral changes in perception induced by multisensory stimuli, maturational changes in multisensory processing, particularly in the temporal realm, are poorly understood. The current study examines the developmental progression of multisensory temporal function by analyzing responses on an audiovisual simultaneity judgment task in 6- to 23-year-old participants. The overarching hypothesis for the study was that multisensory temporal function will mature with increasing age, with the developmental trajectory for this change being the primary point of inquiry. Results indeed reveal an age-dependent decrease in the size of the 'multisensory temporal binding window', the temporal interval within which multisensory stimuli are likely to be perceptually bound, with changes occurring over a surprisingly protracted time course that extends into adolescence.
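One common way to quantify the temporal binding window (a sketch under assumptions, not necessarily this study's exact analysis) is to fit a Gaussian to the proportion of "simultaneous" responses across SOAs and take its width at a fixed criterion, such as the full width at half maximum.

```python
# Minimal sketch (hypothetical data): quantifying the multisensory temporal
# binding window by fitting a Gaussian to the proportion of "simultaneous"
# responses across SOAs and taking its full width at half maximum (FWHM).
import numpy as np
from scipy.optimize import curve_fit

soas = np.array([-400, -300, -200, -100, 0, 100, 200, 300, 400])  # ms
p_simult = np.array([0.08, 0.20, 0.55, 0.85, 0.95, 0.90, 0.65, 0.30, 0.10])

def gauss(soa, amp, center, sigma):
    """Gaussian simultaneity-judgment curve."""
    return amp * np.exp(-((soa - center) ** 2) / (2 * sigma ** 2))

(amp, center, sigma), _ = curve_fit(gauss, soas, p_simult, p0=(1.0, 0.0, 150.0))
fwhm = 2.0 * np.sqrt(2.0 * np.log(2.0)) * sigma  # window width at half height
print(f"Binding window (FWHM) ~ {fwhm:.0f} ms, centred at {center:.0f} ms")
```

On such a fit, the age-dependent narrowing reported here would appear as a decrease in the fitted window width.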

17.
Voice is the carrier of speech but is also an "auditory face" rich in information on the speaker's identity and affective state. Three experiments explored the possibility of a "voice inversion effect," by analogy to the classical "face inversion effect," which could support the hypothesis of a voice-specific module. Experiment 1 consisted of a gender identification task on two syllables pronounced by 90 speakers (boys, girls, men, and women). Experiment 2 consisted of a speaker discrimination task on pairs of syllables (8 men and 8 women). Experiment 3 consisted of an instrument discrimination task on pairs of melodies (8 string and 8 wind instruments). In all three experiments, stimuli were presented in 4 conditions: (1) no inversion; (2) temporal inversion (e.g., backwards speech); (3) frequency inversion centered around 4000 Hz; and (4) around 2500 Hz. Results indicated a significant decrease in performance caused by sound inversion, with a much stronger effect for frequency than for temporal inversion. Interestingly, although frequency inversion markedly affected timbre for both voices and instruments, subjects' performance was still above chance. However, performance at instrument discrimination was much higher than for voices, preventing comparison of inversion effects for voices vs. non-vocal stimuli. Additional experiments will be necessary to conclude on the existence of a possible "voice inversion effect."
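Both sound manipulations have simple signal-processing implementations: temporal inversion is sample reversal, and frequency inversion about a centre frequency f_c can be approximated by ring modulation with a 2·f_c carrier (which maps a component at f to 2·f_c − f), followed by low-pass filtering to remove the sum components. The sketch below illustrates this under those assumptions; it is not necessarily the exact method used in the study.

```python
# Illustrative sketch (not necessarily the study's exact pipeline):
# temporal inversion is simple sample reversal; frequency inversion about a
# centre frequency f_c can be approximated by ring modulation with a 2*f_c
# carrier, which maps a component at f to 2*f_c - f, followed by low-pass
# filtering to remove the unwanted sum components.
import numpy as np
from scipy.signal import butter, filtfilt

def invert_temporal(x):
    """Play the waveform backwards."""
    return x[::-1]

def invert_frequency(x, fs, f_c=4000.0):
    """Reflect the spectrum of `x` about f_c (e.g., 1 kHz maps to 7 kHz)."""
    t = np.arange(len(x)) / fs
    modulated = x * np.cos(2 * np.pi * (2 * f_c) * t)
    b, a = butter(6, (2 * f_c) / (fs / 2), btype="low")  # keep difference band
    return filtfilt(b, a, modulated)

fs = 44100
t = np.arange(0, 0.5, 1 / fs)
tone = np.sin(2 * np.pi * 1000 * t)    # 1 kHz test tone ...
flipped = invert_frequency(tone, fs)   # ... emerges near 7 kHz (8000 - 1000)
```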

18.
This study investigated multisensory interactions in the perception of auditory and visual motion. When auditory and visual apparent motion streams are presented concurrently in opposite directions, participants often fail to discriminate the direction of motion of the auditory stream, whereas perception of the visual stream is unaffected by the direction of auditory motion (Experiment 1). This asymmetry persists even when the perceived quality of apparent motion is equated for the 2 modalities (Experiment 2). Subsequently, it was found that this visual modulation of auditory motion is caused by an illusory reversal in the perceived direction of sounds (Experiment 3). This "dynamic capture" effect occurs over and above ventriloquism among static events (Experiments 4 and 5), and it generalizes to continuous motion displays (Experiment 6). These data are discussed in light of related multisensory phenomena and their support for a "modality appropriateness" interpretation of multisensory integration in motion perception.

19.
In Experiment 1, participants were presented with pairs of stimuli (one visual and the other tactile) from the left and/or right of fixation at varying stimulus onset asynchronies and were required to make unspeeded temporal order judgments (TOJs) regarding which modality was presented first. When the participants adopted an uncrossed-hands posture, just noticeable differences (JNDs) were lower (i.e., multisensory TOJs were more precise) when stimuli were presented from different positions, rather than from the same position. This spatial redundancy benefit was reduced when the participants adopted a crossed-hands posture, suggesting a failure to remap visuotactile space appropriately. In Experiment 2, JNDs were also lower when pairs of auditory and visual stimuli were presented from different positions, rather than from the same position. Taken together, these results demonstrate that people can use redundant spatial cues to facilitate their performance on multisensory TOJ tasks and suggest that previous studies may have systematically overestimated the precision with which people can make such judgments. These results highlight the intimate link between spatial and temporal factors in determining our perception of the multimodal objects and events in the world around us.

20.
Multisensory integration, the binding of sensory information from different sensory modalities, may contribute to perceptual symptomatology in schizophrenia, including hallucinations and aberrant speech perception. Differences in multisensory integration and temporal processing, an important component of multisensory integration, are consistently found in schizophrenia. Evidence is emerging that these differences extend across the schizophrenia spectrum, including individuals in the general population with higher schizotypal traits. In the current study, we investigated the relationship between schizotypal traits and perceptual functioning, using audiovisual speech-in-noise, McGurk, and ternary synchrony judgment tasks. We measured schizotypal traits using the Schizotypal Personality Questionnaire (SPQ), hypothesizing that higher scores on the Unusual Perceptual Experiences and Odd Speech subscales would be associated with decreased multisensory integration, increased susceptibility to distracting auditory speech, and less precise temporal processing. Surprisingly, these measures were not associated with the predicted subscales, suggesting that these perceptual differences may not be present across the schizophrenia spectrum.
