Similar articles: 20 results
1.
Perception of emotion is critical for successful social interaction, yet the neural mechanisms underlying the perception of dynamic, audio-visual emotional cues are poorly understood. Evidence from language and sensory paradigms suggests that the superior temporal sulcus and gyrus (STS/STG) play a key role in the integration of auditory and visual cues. Emotion perception research has focused on static facial cues; however, dynamic audio-visual (AV) cues mimic real-world social cues more accurately than static and/or unimodal stimuli. Novel dynamic AV stimuli were presented using a block design in two fMRI studies, comparing bimodal stimuli to unimodal conditions, and emotional to neutral stimuli. Results suggest that the bilateral superior temporal region plays distinct roles in the perception of emotion and in the integration of auditory and visual cues. Given the greater ecological validity of the stimuli developed for this study, this paradigm may be helpful in elucidating the deficits in emotion perception experienced by clinical populations.

2.
In face-to-face conversation, speech is perceived by ear and eye. We studied the prerequisites of audio-visual speech perception by using perceptually ambiguous sine wave replicas of natural speech as auditory stimuli. When the subjects were not aware that the auditory stimuli were speech, they showed only negligible integration of auditory and visual stimuli. When the same subjects learned to perceive the same auditory stimuli as speech, they integrated the auditory and visual stimuli in a manner similar to that for natural speech. These results demonstrate the existence of a multisensory speech-specific mode of perception.

3.
Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that message is presented in a noisy background. Speech is a particularly important example of multisensory integration because of its behavioural relevance to humans and also because brain regions have been identified that appear to be specifically tuned for auditory speech and lip gestures. Previous research has suggested that speech stimuli may have an advantage over other types of auditory stimuli in terms of audio-visual integration. Here, we used a modified adaptive psychophysical staircase approach to compare the influence of congruent visual stimuli (brief movie clips) on the detection of noise-masked auditory speech and non-speech stimuli. We found that congruent visual stimuli significantly improved detection of an auditory stimulus relative to incongruent visual stimuli. This effect, however, was equally apparent for speech and non-speech stimuli. The findings suggest that speech stimuli are not specifically advantaged by audio-visual integration for detection at threshold when compared with other naturalistic sounds.
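For readers unfamiliar with the method named above, the sketch below shows a generic transformed up-down (2-down/1-up) adaptive staircase of the kind commonly used to estimate detection thresholds in noise. It illustrates the general technique only, not the authors' procedure: the step size, stopping rule, and simulated observer are all hypothetical.

```python
# Generic 2-down/1-up adaptive staircase for a detection threshold in noise.
# Illustrative sketch only; parameters and the simulated observer are hypothetical.
import random
import statistics

def simulated_observer(snr_db, true_threshold_db=-12.0, slope=1.0):
    """Hypothetical observer: detection probability rises with SNR around a threshold."""
    p = 1.0 / (1.0 + 10 ** (-slope * (snr_db - true_threshold_db) / 10.0))
    return random.random() < p

def run_staircase(start_snr_db=0.0, step_db=2.0, n_reversals=8):
    """2-down/1-up rule: converges near the 70.7%-correct detection level."""
    snr, correct_streak, direction, reversals = start_snr_db, 0, None, []
    while len(reversals) < n_reversals:
        if simulated_observer(snr):
            correct_streak += 1
            if correct_streak < 2:
                continue                      # need two hits before stepping down
            correct_streak = 0
            new_direction = "down"            # make the target harder to detect
            snr -= step_db
        else:
            correct_streak = 0
            new_direction = "up"              # make the target easier to detect
            snr += step_db
        if direction is not None and new_direction != direction:
            reversals.append(snr)             # record direction reversals
        direction = new_direction
    return statistics.mean(reversals[-6:])    # threshold = mean of the last reversals

if __name__ == "__main__":
    print(f"Estimated detection threshold: {run_staircase():.1f} dB SNR")
```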

4.
A perception of coherent motion can be obtained in an otherwise ambiguous or illusory visual display by directing one's attention to a feature and tracking it. We demonstrate an analogous auditory effect in two separate sets of experiments. The temporal dynamics associated with the attention-dependent auditory motion closely matched those previously reported for attention-based visual motion. Since attention-based motion mechanisms appear to exist in both modalities, we also tested for multimodal (audiovisual) attention-based motion, using stimuli composed of interleaved visual and auditory cues. Although subjects were able to track a trajectory using cues from both modalities, no one spontaneously perceived "multimodal motion" across both visual and auditory cues. Rather, they reported motion perception only within each modality, thereby revealing a spatiotemporal limit on putative cross-modal motion integration. Together, results from these experiments demonstrate the existence of attention-based motion in audition, extending current theories of attention-based mechanisms from visual to auditory systems.

5.
The goal of this study was to evaluate the claim that young children display preferences for auditory stimuli over visual stimuli. This study was motivated by concerns that the visual stimuli employed in prior studies were considerably more complex and less distinctive than the competing auditory stimuli, resulting in an illusory preference for auditory cues. Across three experiments, preschool-age children and adults were trained to use paired audio-visual cues to predict the location of a target. At test, the cues were switched so that auditory cues indicated one location and visual cues indicated the opposite location. In contrast to prior studies, preschool-age children did not exhibit auditory dominance. Instead, children and adults flexibly shifted their preferences as a function of the degree of contrast within each modality, with higher contrast in a modality leading to greater use of that modality's cues.

6.
This study examined the influence of advance-preparation effects on emotional audio-visual integration along the temporal and emotional-cognition dimensions. In a temporal discrimination task (Experiment 1), visually guided responses were significantly slower than auditorily guided responses, and the magnitude of the integration effect was negative. In an emotion discrimination task (Experiment 2), the magnitude of the integration effect was positive; for negative emotions, integration was significantly greater under auditory guidance than under visual guidance, whereas for positive emotions it was significantly greater under visual guidance than under auditory guidance. These results indicate that emotional audio-visual integration is based on emotional-cognitive processing, whereas temporal discrimination suppresses integration; moreover, both the cross-modal advance-preparation effect and the emotional advance-preparation effect depend on the guiding modality.

7.
When auditory stimuli are used in two-dimensional spatial compatibility tasks, where the stimulus and response configurations vary along the horizontal and vertical dimensions simultaneously, a right–left prevalence effect occurs in which horizontal compatibility dominates over vertical compatibility. The right–left prevalence effects obtained with auditory stimuli are typically larger than those obtained with visual stimuli, even though less attention should be demanded by the horizontal dimension in auditory processing. In the present study, we examined whether auditory or visual dominance occurs when the two-dimensional stimuli are audiovisual, as well as whether there is cross-modal facilitation of response selection for the horizontal and vertical dimensions. We also examined whether there is an additional benefit of adding a pitch dimension to the auditory stimulus to facilitate vertical coding through use of the spatial-musical association of response codes (SMARC) effect, where pitch is coded in terms of height in space. In Experiment 1, we found a larger right–left prevalence effect for unimodal auditory than for visual stimuli. Neutral, non-pitch-coded audiovisual stimuli did not result in cross-modal facilitation, but did show evidence of visual dominance. The right–left prevalence effect was eliminated in the presence of SMARC audiovisual stimuli, but the effect influenced horizontal rather than vertical coding. Experiment 2 showed that the influence of the pitch dimension was not in terms of influencing response selection on a trial-to-trial basis, but in terms of altering the salience of the task environment. Taken together, these findings indicate that in the absence of salient vertical cues, auditory and audiovisual stimuli tend to be coded along the horizontal dimension, and vision tends to dominate audition in this two-dimensional spatial stimulus–response task.

8.
The presence of congruent visual speech information facilitates the identification of auditory speech, while the addition of incongruent visual speech information often impairs accuracy. This latter arrangement occurs naturally when one is being directly addressed in conversation but listens to a different speaker. Under these conditions, performance may diminish since: (a) one is bereft of the facilitative effects of the corresponding lip motion and (b) one becomes subject to visual distortion by incongruent visual speech; by contrast, speech intelligibility may be improved due to (c) bimodal localization of the central unattended stimulus. Participants were exposed to centrally presented visual and auditory speech while attending to a peripheral speech stream. In some trials, the lip movements of the central visual stimulus matched the unattended speech stream; in others, the lip movements matched the attended peripheral speech. Accuracy for the peripheral stimulus was nearly one standard deviation greater with incongruent visual information, compared to the congruent condition, which provided bimodal pattern-recognition cues. Likely, the bimodal localization of the central stimulus further differentiated the stimuli and thus facilitated intelligibility. Results are discussed with regard to similar findings in an investigation of the ventriloquist effect, and the relative strength of localization and speech cues in covert listening.

9.
The effect of audiovisual interactions on size perception has yet to be examined, despite its fundamental importance in daily life. Previous studies have reported that object length can be estimated solely on the basis of the sounds produced when an object is dropped. Moreover, it has been shown that people typically and easily perceive the correspondence between object sizes and sound intensities. It is therefore possible that auditory stimuli may act as cues for object size, thereby altering the visual perception of size. Thus, in the present study we examined the effects of auditory stimuli on the visual perception of size. Specifically, we investigated the effects of the sound intensity of auditory stimuli, the temporal window of audiovisual interactions, and the effects of the retinal eccentricity of visual stimuli. The results indicated that high-intensity auditory stimuli increased visually perceived object size, and that this effect was especially strong in the peripheral visual field. Further analysis indicated that this effect on the visual perception of size is induced when the cue reliability is relatively higher for the auditory than for the visual stimuli. In addition, we further suggest that the cue reliabilities of visual and auditory stimuli relate to retinal eccentricity and sound intensity, respectively.
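The reliability argument in this abstract follows the general logic of reliability-weighted (inverse-variance) cue combination. The sketch below illustrates that standard model with made-up numbers, not the authors' data or analysis: when the visual size estimate is assumed to be noisier, as in the peripheral visual field, the auditory cue pulls the combined estimate more strongly.

```python
# Standard inverse-variance (reliability-weighted) cue combination.
# All numbers are hypothetical illustration units, not data from the study.

def combine_cues(visual_estimate, visual_var, auditory_estimate, auditory_var):
    """More reliable cues (lower variance) receive larger weights in the fused estimate."""
    w_visual = (1.0 / visual_var) / (1.0 / visual_var + 1.0 / auditory_var)
    w_auditory = 1.0 - w_visual
    return w_visual * visual_estimate + w_auditory * auditory_estimate

# Hypothetical central-vision case: the visual cue is reliable, so a loud sound
# barely shifts the perceived size.
print(combine_cues(visual_estimate=10.0, visual_var=0.5,
                   auditory_estimate=14.0, auditory_var=4.0))   # ~10.4

# Hypothetical peripheral-vision case: visual reliability drops, so the same
# loud sound inflates the perceived size noticeably.
print(combine_cues(visual_estimate=10.0, visual_var=4.0,
                   auditory_estimate=14.0, auditory_var=2.0))   # ~12.7
```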

10.
Previous research has shown that irrelevant sounds can facilitate the perception of visual apparent motion. Here, the effectiveness of a single sound in facilitating motion perception was investigated in three experiments. Observers were presented with two discrete lights temporally separated by stimulus onset asynchronies from 0 to 350 ms. After each trial, observers classified their impression of the stimuli using a categorisation system. A short sound presented temporally (and spatially) midway between the lights facilitated the impression of motion relative to baseline (lights without sound), whereas a sound presented either before the first or after the second light or simultaneously with the lights did not affect motion impression. The facilitation effect also occurred with sound presented far from the visual display, as well as with a continuous sound that started with the first light and terminated with the second light. No facilitation of visual motion perception occurred if the sound was part of a tone sequence that allowed for intramodal perceptual grouping of the auditory stimuli prior to the critical audiovisual stimuli. Taken together, the findings are consistent with a low-level audiovisual integration approach in which the perceptual system merges temporally proximate sound and light stimuli, thereby provoking the impression of a single multimodal moving object.

11.
Tracking an audio-visual target involves integrating spatial cues about target position from both modalities. Such sensory cue integration is a developmental process in the brain involving learning, with neuroplasticity as its underlying mechanism. We present a Hebbian learning-based adaptive neural circuit for multi-modal cue integration. The circuit temporally correlates stimulus cues within each modality via intramodal learning as well as symmetrically across modalities via crossmodal learning to independently update modality-specific neural weights on a sample-by-sample basis. It is realised as a robotic agent that must orient towards a moving audio-visual target. It continuously learns the best possible weights required for a weighted combination of auditory and visual spatial target directional cues that is directly mapped to robot wheel velocities to elicit an orientation response. Visual directional cues are noise-free and continuous but arise from a relatively narrow receptive field, while auditory directional cues are noisy and intermittent but arise from a relatively wide receptive field. Comparative trials in simulation demonstrate that concurrent intramodal learning improves both the overall accuracy and the precision of the orientation responses produced by symmetric crossmodal learning. We also demonstrate that symmetric crossmodal learning improves multisensory responses compared with asymmetric crossmodal learning. The neural circuit also exhibits multisensory effects such as sub-additivity, additivity, and super-additivity.
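To make the weighted-combination idea concrete, the sketch below shows a Hebbian-style scheme in which each modality's weight grows with the temporal consistency of its own directional cue and slowly decays, so the less noisy modality ends up weighted more heavily in the fused orientation command. This is a schematic of the general principle only, not the circuit described in the paper; the cue model and all parameters are hypothetical.

```python
# Schematic Hebbian-style weighting of audio-visual directional cues.
# Each weight grows with its cue's temporal self-consistency and decays slowly,
# so the precise visual cue ends up outweighing the noisy auditory cue.
import math
import random

def consistency(current_deg, previous_deg, scale=20.0):
    """Agreement between successive directional estimates; 1.0 when identical."""
    return math.exp(-abs(current_deg - previous_deg) / scale)

random.seed(0)
w_v, w_a = 0.5, 0.5              # initial visual / auditory weights
eta, decay = 0.05, 0.01          # hypothetical learning rate and weight decay
target = 10.0                    # target bearing in degrees
prev_v = prev_a = target

for step in range(2000):
    visual = target + random.gauss(0, 2)      # precise visual directional cue
    auditory = target + random.gauss(0, 15)   # noisy auditory directional cue
    # Hebbian-like updates driven by each cue's temporal self-consistency.
    w_v += eta * consistency(visual, prev_v) - decay * w_v
    w_a += eta * consistency(auditory, prev_a) - decay * w_a
    prev_v, prev_a = visual, auditory
    # Weighted combination of the two directional cues; on a robot this value
    # would be mapped to differential wheel velocities to turn toward the target.
    fused = (w_v * visual + w_a * auditory) / (w_v + w_a)

print(f"learned weights  visual: {w_v:.2f}  auditory: {w_a:.2f}; last command: {fused:.1f} deg")
```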

12.
In three experiments, listeners were required to either localize or identify the second of two successive sounds. The first sound (the cue) and the second sound (the target) could originate from either the same or different locations, and the interval between the onsets of the two sounds (stimulus onset asynchrony, SOA) was varied. Sounds were presented out of visual range at 135° azimuth, left or right. In Experiment 1, localization responses were made more quickly at 100 ms SOA when the target sounded from the same location as the cue (i.e., a facilitative effect), and at 700 ms SOA when the target and cue sounded from different locations (i.e., an inhibitory effect). In Experiments 2 and 3, listeners were required to monitor visual information presented directly in front of them at the same time as the auditory cue and target were presented behind them. These two experiments differed in that, in order to perform the visual task accurately in Experiment 3, eye movements to the visual stimuli were required. In both experiments, a transition from facilitation at a brief SOA to inhibition at a longer SOA was observed for the auditory task. Taken together, these results suggest that location-based auditory inhibition of return (IOR) is not dependent on either eye movements or saccade programming to sound locations.

13.
In this study, we show that the contingent auditory motion aftereffect is strongly influenced by visual motion information. During an induction phase, participants listened to rightward-moving sounds with falling pitch alternated with leftward-moving sounds with rising pitch (or vice versa). Auditory aftereffects (i.e., a shift in the psychometric function for unimodal auditory motion perception) were bigger when a visual stimulus moved in the same direction as the sound than when no visual stimulus was presented. When the visual stimulus moved in the opposite direction, aftereffects were reversed and thus became contingent upon visual motion. When visual motion was combined with a stationary sound, no aftereffect was observed. These findings indicate that there are strong perceptual links between the visual and auditory motion-processing systems.
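As background on how such a "shift in the psychometric function" is typically quantified, the sketch below fits logistic psychometric functions to made-up response proportions before and after adaptation and reports the displacement of the 50% point (the point of subjective equality). It is a generic illustration only; the data and parameters are fabricated for demonstration and are not from the study.

```python
# Quantifying an aftereffect as a psychometric-function shift (illustration only;
# the response proportions below are fabricated for demonstration purposes).
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, pse, slope):
    """Proportion of 'rightward' responses as a function of stimulus displacement."""
    return 1.0 / (1.0 + np.exp(-slope * (x - pse)))

displacement = np.array([-4, -3, -2, -1, 0, 1, 2, 3, 4], dtype=float)  # arbitrary units
p_right_baseline = np.array([0.02, 0.05, 0.15, 0.30, 0.50, 0.70, 0.85, 0.95, 0.98])
p_right_adapted  = np.array([0.01, 0.02, 0.06, 0.15, 0.28, 0.50, 0.72, 0.88, 0.96])

popt_base, _ = curve_fit(logistic, displacement, p_right_baseline, p0=[0.0, 1.0])
popt_adapt, _ = curve_fit(logistic, displacement, p_right_adapted, p0=[0.0, 1.0])

# A positive shift means previously neutral stimuli now need rightward displacement
# to be judged stationary, i.e. an aftereffect opposite to the adapting direction.
print(f"PSE shift (aftereffect size): {popt_adapt[0] - popt_base[0]:.2f} units")
```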

14.
Many studies on multisensory processes have focused on performance in simplified experimental situations, with a single stimulus in each sensory modality. However, these results cannot necessarily be applied to explain our perceptual behavior in natural scenes, where various signals exist within one sensory modality. We investigated the role of audio-visual syllable congruency in participants' auditory localization bias, or the ventriloquism effect, using spoken utterances and two videos of a talking face. The salience of the facial movements was also manipulated. Results indicated that more salient visual utterances attracted participants' auditory localization. Congruent pairing of audio-visual utterances elicited greater localization bias than incongruent pairing, whereas previous studies have reported little dependency on the reality of the stimuli in ventriloquism. Moreover, audio-visual illusory congruency, owing to the McGurk effect, caused substantial visual interference with auditory localization. Multisensory performance appears more flexible and adaptive in this complex environment than in previous studies.

15.
Observers are able to segment continuous everyday activity into meaningful parts. This ability may be related to processing low-level visual cues, such as changes in motion. To address this issue, the present study combined measurement of evoked responses to event boundaries with functional identification of the extrastriate motion complex (MT+) and the frontal eye field (FEF), two regions related to motion perception and eye movements. The results provided strong evidence that MT+ is activated by event boundaries: Individuals' MT+ regions showed strong responses to event boundaries, and MT+ was collocated with a lateral posterior region that responded at event boundaries. The evidence regarding the FEF was less conclusive: The FEF showed reliable but relatively reduced responses to event boundaries, but the FEF was medial and superior to a frontal area that responded at event boundaries. These results suggest that motion cues, and possibly eye movements, may play key roles in event structure perception.

16.
We examined whether facial expressions of performers influence the emotional connotations of sung materials, and whether attention is implicated in audio-visual integration of affective cues. In Experiment 1, participants judged the emotional valence of audio-visual presentations of sung intervals. Performances were edited such that auditory and visual information conveyed congruent or incongruent affective connotations. In the single-task condition, participants judged the emotional connotation of sung intervals. In the dual-task condition, participants judged the emotional connotation of intervals while performing a secondary task. Judgements were influenced by melodic cues and facial expressions, and these effects were undiminished by the secondary task. Experiment 2 involved identical conditions, but participants were instructed to base judgements on auditory information alone. Again, facial expressions influenced judgements, and the effect was undiminished by the secondary task. The results suggest that visual aspects of music performance are automatically and preattentively registered and integrated with auditory cues.

17.
Previous work has found that repetitive auditory stimulation (click trains) increases the subjective velocity of subsequently presented moving stimuli. We ask whether the effect of click trains is stronger for retinal velocity signals (produced when the target moves across the retina) or for extraretinal velocity signals (produced during smooth pursuit eye movements, when target motion across the retina is limited). In Experiment 1, participants viewed leftward or rightward moving single dot targets, travelling at speeds from 7.5 to 17.5 deg/s. They estimated velocity at the end of each trial. Prior presentation of auditory click trains increased estimated velocity, but only in the pursuit condition, where estimates were based on extraretinal velocity signals. Experiment 2 generalized this result to vertical motion. Experiment 3 found that the effect of clicks during pursuit disappeared when participants tracked across a visually textured background that provided strong local motion cues. Together these results suggest that auditory click trains selectively affect extraretinal velocity signals. This novel finding suggests that the cross-modal integration required for auditory click trains to influence subjective velocity operates at later stages of processing.

18.
Lip reading is the ability to partially understand speech by looking at the speaker's lips. It improves the intelligibility of speech in noise when audio-visual perception is compared with audio-only perception. A recent set of experiments showed that seeing the speaker's lips also enhances sensitivity to acoustic information, decreasing the auditory detection threshold of speech embedded in noise [J. Acoust. Soc. Am. 109 (2001) 2272; J. Acoust. Soc. Am. 108 (2000) 1197]. However, detection is different from comprehension, and it remains to be seen whether improved sensitivity also results in an intelligibility gain in audio-visual speech perception. In this work, we use an original paradigm to show that seeing the speaker's lips enables the listener to hear better and hence to understand better. The audio-visual stimuli used here could not be differentiated by lip reading per se, since they contained exactly the same lip gesture matched with different compatible speech sounds. Nevertheless, the noise-masked stimuli were more intelligible in the audio-visual condition than in the audio-only condition due to the contribution of visual information to the extraction of acoustic cues. Replacing the lip gesture with a non-speech visual input with exactly the same time course, providing the same temporal cues for extraction, removed the intelligibility benefit. This early contribution to audio-visual speech identification is discussed in relation to recent neurophysiological data on audio-visual perception.

19.
Speech prosody has traditionally been considered solely in terms of its auditory features, yet correlated visual features exist, such as head and eyebrow movements. This study investigated the extent to which visual prosodic features are able to affect the perception of the auditory features. Participants were presented with videos of a speaker pronouncing two words, with visual features of emphasis on one of these words. For each trial, participants saw one video where the two words were identical in both pitch and amplitude, and another video where there was a difference in either pitch or amplitude that was congruent or incongruent with the visual changes. Participants were asked to decide which video contained the sound difference. Thresholds were obtained for the congruent and incongruent videos, and for an auditory-alone condition. It was found that the congruent thresholds were better than the incongruent thresholds for both pitch and amplitude changes. Interestingly, the congruent thresholds for amplitude were better than for the auditory-alone condition, which implies that the visual features improve sensitivity to loudness changes. These results demonstrate that visual stimuli can affect auditory thresholds for changes in pitch and amplitude, and furthermore support the view that visual prosodic features enhance speech processing.

20.
Schutz M, Lipscomb S. Perception, 2007, 36(6): 888-897
Percussionists inadvertently use visual information to strategically manipulate audience perception of note duration. Videos of long (L) and short (S) notes performed by a world-renowned percussionist were separated into visual (Lv, Sv) and auditory (La, Sa) components. The visual components contained only the gesture used to perform the note; the auditory components contained the acoustic note itself. Audio and visual components were then crossed to create realistic musical stimuli. Participants were informed of the mismatch and asked to rate the note duration of these audio-visual pairs based on sound alone. Ratings varied based on the visual (Lv versus Sv), but not the auditory (La versus Sa), components. Therefore, while longer gestures do not make longer notes, they do make longer-sounding notes through the integration of sensory information. This finding contradicts previous research showing that audition dominates temporal tasks such as duration judgment.
