Similar Documents
20 similar documents found
1.
2.
In everyday life, language use typically occurs within a visual context. A large body of cognitive science research has shown that the visual and language processing modules do not operate independently but interact in complex ways. Taking the influence of visual information on language processing as its main thread, this paper first reviews research progress on how visual information affects speech comprehension, speech production, and verbal communication. It then focuses on the mechanisms by which visual information influences language processing. Finally, it introduces computational models of these effects and outlines directions for future research.

3.
Children's visual attention to, and comprehension of, a television program was measured as a function of inserts called preplays that varied on two orthogonal dimensions: (1) presence or absence of visual excerpts from the program, and (2) concrete or inferential story narration. Visual fixation was coded continuously for 64 pairs of same-sex children, in 1st-4th grades, while they viewed the television program with one of four types of preplays. After viewing, each child answered items assessing his or her comprehension of the visually and verbally presented content. Children who viewed visual preplays attended longer than did children who viewed nonvisual preplays. Visual presentation predicted comprehension of content presented in a visual mode, whereas inferential narration predicted comprehension of implicit content presented in a verbal mode. The results suggest that information processing is modality specific: visual presentation affects visual processing and abstract language affects verbal processing. The results do not support the hypothesis that visual presentation interferes with linguistic processing.

4.
Individual differences in children's online language processing were explored by monitoring their eye movements to objects in a visual scene as they listened to spoken sentences. Eleven skilled and 11 less-skilled comprehenders were presented with sentences containing verbs that were either neutral with respect to the visual context (e.g., Jane watched her mother choose the cake, where all of the objects in the scene were choosable) or supportive (e.g., Jane watched her mother eat the cake, where the cake was the only edible object). On hearing the supportive verb, the children made fast anticipatory eye movements to the target object (e.g., the cake), suggesting that children extract information from the language they hear and use it to direct ongoing processing. Less-skilled comprehenders did not differ from controls in the speed of their anticipatory eye movements, suggesting normal sensitivity to linguistic constraints. However, less-skilled comprehenders made a greater number of fixations to target objects, and these fixations were shorter in duration than those observed in the skilled comprehenders, especially in the supportive condition. This pattern of results is discussed in terms of possible processing limitations, including difficulties with memory, attention, or suppressing irrelevant information.

5.
Two experiments with 5- and 7-year-old children tested the hypotheses that auditory attention is used to (a) monitor a TV program for important visual content, and (b) semantically process program information through language to enhance comprehension and visual attention. A direct measure of auditory attention was the latency of the child's restoration of gradually degraded sound quality. Restoration of auditory clarity did not vary as a function of looking. Restoration of visual clarity was faster when looking than when not looking. Restoration was faster for visual than for auditory degrades, but audiovisual degrades were restored most rapidly of all, suggesting that dual-modality presentation maximizes children's attention. Narration enhanced visual attention and comprehension, including comprehension of visually presented material. Auditory comprehension did not depend on looking, suggesting that children can semantically process verbal content without looking at the TV. Auditory attention did not differ with the presence or absence of narration, but it predicted auditory comprehension best, while visual attention predicted visual comprehension best. In the absence of narration, auditory attention predicted visual comprehension, suggesting its monitoring function. Visual attention indexed overall interest and appeared to be most critical for comprehension in the absence of narration.

6.
7.
We compare the processing of transitive sentences in young learners of a strict word-order language (English) and of two languages that allow noun omission and many word-order variants: Turkish, a case-marked language, and Mandarin Chinese, a non-case-marked language. Children aged 1-3 years listened to simple transitive sentences in the typical word order of their language, paired with two visual scenes, only one of which matched the sentence. Multiple measures of comprehension (percent of looking to the match, latency to look to the match, number of switches of attention) revealed a general pattern of early sensitivity to word order, coupled with language and age effects on children's processing efficiency. In particular, English learners showed temporally speedier processing of transitive sentences than Turkish learners, who also displayed more uncertainty about the matching scene. Mandarin learners behaved like Turkish learners in showing slower processing of sentences, and all language groups displayed faster processing by older than by younger children. These results demonstrate that sentence processing is sensitive to crosslinguistic features from early in language development.
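The three comprehension measures named in this abstract reduce to simple functions of a per-sample gaze record. A minimal sketch, assuming a boolean vector of "looking at the matching scene" samples (the function name and coding criteria are illustrative, not the authors' scheme):

```python
import numpy as np

def looking_measures(on_match, fs):
    """Summarise a looking-while-listening trial.

    on_match: boolean array, one entry per gaze sample,
              True when the child fixates the matching scene.
    fs:       eye-tracker sampling rate in Hz.
    """
    prop = on_match.mean()                            # percent looking to match
    hits = np.flatnonzero(on_match)
    latency = hits[0] / fs if hits.size else np.nan   # first look to match (s)
    switches = int(np.abs(np.diff(on_match.astype(int))).sum())  # attention shifts
    return prop, latency, switches
```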

8.
Scene gist refers to the perceptual and semantic information an observer acquires within a single fixation on a scene. In recent years, scene gist processing has become an important topic in visual perception research; studying it will help reveal the mechanisms of visual information processing and also offers lessons for the development of intelligent machine vision. This paper reviews the factors that influence scene gist processing, the controversial issues in the field, and the neural basis of scene gist. Future work could further explore the basic units of scene gist processing, the relevant theoretical accounts, the factors modulating hierarchical processing, the modulatory role of attention, temporal dynamics, and the construction of brain functional networks.

9.
Visual information conveyed by iconic hand gestures and visible speech can enhance speech comprehension under adverse listening conditions for both native and non-native listeners. However, how a listener allocates visual attention to these articulators during speech comprehension is unknown. We used eye-tracking to investigate whether and how native and highly proficient non-native listeners of Dutch allocated overt eye gaze to visible speech and gestures during clear and degraded speech comprehension. Participants watched video clips of an actress uttering a clear or degraded (6-band noise-vocoded) action verb while performing a gesture or not, and were asked to indicate the word they heard in a cued-recall task. Gestural enhancement was the largest (i.e., a relative reduction in reaction time cost) when speech was degraded for all listeners, but it was stronger for native listeners. Both native and non-native listeners mostly gazed at the face during comprehension, but non-native listeners gazed more often at gestures than native listeners. However, only native but not non-native listeners' gaze allocation to gestures predicted gestural benefit during degraded speech comprehension. We conclude that non-native listeners might gaze at gesture more as it might be more challenging for non-native listeners to resolve the degraded auditory cues and couple those cues to phonological information that is conveyed by visible speech. This diminished phonological knowledge might hinder the use of semantic information that is conveyed by gestures for non-native compared to native listeners. Our results demonstrate that the degree of language experience impacts overt visual attention to visual articulators, resulting in different visual benefits for native versus non-native listeners.
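The "6-band noise-vocoded" degradation mentioned here is a standard manipulation: the speech signal is split into frequency bands, each band's amplitude envelope is extracted, and the envelopes modulate band-limited noise. A minimal sketch of the general technique (band edges, filter order, and envelope method are assumptions, not the authors' exact stimulus pipeline):

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def noise_vocode(speech, fs, n_bands=6, f_lo=100.0, f_hi=8000.0):
    """Replace the fine structure of speech with noise, keeping only
    the per-band amplitude envelopes (illustrative parameters)."""
    edges = np.logspace(np.log10(f_lo), np.log10(f_hi), n_bands + 1)
    noise = np.random.randn(len(speech))
    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        band = sosfilt(sos, speech)
        envelope = np.abs(hilbert(band))      # slow amplitude contour
        carrier = sosfilt(sos, noise)         # band-limited noise carrier
        out += envelope * carrier
    return out / (np.abs(out).max() + 1e-9)   # normalise to avoid clipping
```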

10.
Conversation is supported by the beliefs that people have in common and the perceptual experience that they share. The visual context of a conversation has two aspects: the information that is available to each conversant, and their beliefs about what is present for each other. In our experiment, we separated these factors for the first time and examined their impact on a spontaneous conversation. We manipulated the fact that a visual scene was shared or not and the belief that a visual scene was shared or not. Participants watched videos of actors talking about a controversial topic, then discussed their own views while looking at either a blank screen or the actors. Each believed (correctly or not) that their partner was either looking at a blank screen or the same images. We recorded conversants' eye movements, quantified how they were coordinated, and analyzed their speech patterns. Gaze coordination has been shown to be causally related to the knowledge people share before a conversation, and the information they later recall. Here, we found that both the presence of the visual scene, and beliefs about its presence for another, influenced language use and gaze coordination.

11.
This study investigated processing effort by measuring people's pupil diameter as they listened to sentences containing a temporary syntactic ambiguity. In the first experiment, we manipulated prosody. The results showed that when prosodic structure conflicted with syntactic structure, pupil diameter reliably increased. In the second experiment, we manipulated both prosody and visual context. The results showed that when visual context was consistent with the correct interpretation, prosody had very little effect on processing effort. However, when visual context was inconsistent with the correct interpretation, prosody had a large effect on processing effort. The interaction between visual context and prosody shows that visual context has an effect on online processing and that it can modulate the influence of linguistic sources of information, such as prosody. Pupillometry is a sensitive measure of processing effort during spoken language comprehension.
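Pupillometric effort measures of this kind are usually computed as baseline-corrected dilation time-locked to the critical region of the sentence. A minimal sketch of that computation (window lengths and names are assumed for illustration, not taken from the authors' analysis):

```python
import numpy as np

def dilation_effect(pupil, fs, onset_s, baseline_s=0.5, window_s=2.0):
    """Mean baseline-corrected pupil diameter after a critical word.

    pupil:   pupil-diameter samples for one trial.
    fs:      sampling rate in Hz.
    onset_s: onset time (s) of the disambiguating region.
    """
    onset = int(onset_s * fs)
    baseline = pupil[max(onset - int(baseline_s * fs), 0):onset].mean()
    response = pupil[onset:onset + int(window_s * fs)]
    return (response - baseline).mean()   # larger = more dilation = more effort
```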

12.
Executive working memory load induces inattentional blindness
When attention is engaged in a task, unexpected events in the visual scene may go undetected, a phenomenon known as inattentional blindness (IB). At what stage of information processing must attention be engaged for IB to occur? Although manipulations that tax visuospatial attention can induce IB, the evidence is more equivocal for tasks that engage attention at late, central stages of information processing. Here, we tested whether IB can be specifically induced by central executive processes. An unexpected visual stimulus was presented during the retention interval of a working memory task that involved either simply maintaining verbal material or rearranging the material into alphabetical order. The unexpected stimulus was more likely to be missed during manipulation than during simple maintenance of the verbal information. Thus, the engagement of executive processes impairs the ability to detect unexpected, task-irrelevant stimuli, suggesting that IB can result from central, amodal stages of processing.

13.
Two studies investigated the interaction between utterance and scene processing by monitoring eye movements in agent-action-patient events while participants listened to related utterances. The aim of Experiment 1 was to determine whether and when depicted events are used for thematic role assignment and structural disambiguation of temporarily ambiguous English sentences. Shortly after the verb identified the relevant depicted action, eye movements in the event scenes revealed disambiguation. Experiment 2 investigated the relative importance of linguistic/world knowledge and scene information. When the verb identified either only the stereotypical agent of a (nondepicted) action or the (nonstereotypical) agent of a depicted action as relevant, verb-based thematic knowledge and the depicted action each rapidly influenced comprehension. In contrast, when the verb identified both of these agents as relevant, the gaze pattern suggested that comprehension preferentially relied on depicted events over stereotypical thematic knowledge for thematic interpretation. We relate our findings to theories of language comprehension and acquisition.

14.
This research aimed to study the role of subtitling in film comprehension. It focused on the language in which the subtitles were written and on the participants' fluency in the languages presented in the film. In a preliminary part of the study, the most salient visual and dialogue elements of a short sequence of an English film were extracted by means of a free recall task after showing two versions of the film (first a silent, then a dubbed-into-French version) to native French speakers. This visual and dialogue information was used to construct a questionnaire on comprehension of the film for the main part of the study, in which other native French speakers with beginner, intermediate, or advanced fluency in English were shown one of three versions of the film used in the preliminary part: with no subtitles, with English subtitles, or with French subtitles. The results indicate a global interaction between all three factors in this study: for the beginners, visual processing dropped from the version without subtitles to that with English subtitles, and even more so when French subtitles were provided, whereas the effect of film version on dialogue comprehension was the reverse. The advanced participants achieved higher comprehension for both types of information with the version without subtitles, and dialogue information was always processed better than visual information. The intermediate group likewise processed dialogue better than visual information but was not affected by film version. These results imply that, depending on viewers' fluency levels, the language of subtitles can have different effects on the processing of film information.

15.
Heim, S. (2008). Brain and Language, 106(1), 55-64.
Despite the increasing number of neuroimaging studies of syntactic gender processing, no model is currently available that includes data from visual and auditory language comprehension and language production. This paper provides a systematic review of the neural correlates of syntactic gender processing. Based on anatomical information from cytoarchitectonic probability maps, it is argued that left BA 44 plays a central role in the active use of gender information, e.g., for explicit decisions as well as for subsequent morphological encoding. Left BA 45 is involved in the strategic generation of morphological cues that facilitate gender processing. Implications of the model for aphasic patients with lesions including or excluding parts of Broca's speech region are discussed.

16.
Staudte, M., & Crocker, M. W. (2011). Cognition, (2), 268-291.
Referential gaze during situated language production and comprehension is tightly coupled with the unfolding speech stream (Griffin, 2001; Meyer et al., 1998; Tanenhaus et al., 1995). In a shared environment, utterance comprehension may further be facilitated when the listener can exploit the speaker's focus of (visual) attention to anticipate, ground, and disambiguate spoken references. To investigate the dynamics of such gaze-following and its influence on utterance comprehension in a controlled manner, we use a human-robot interaction setting. Specifically, we hypothesize that referential gaze is interpreted as a cue to the speaker's referential intentions, which facilitates or disrupts reference resolution. Moreover, the use of a dynamic and yet extremely controlled gaze cue enables us to shed light on the simultaneous and incremental integration of the unfolding speech and gaze movement. We report evidence from two eye-tracking experiments in which participants saw videos of a robot looking at and describing objects in a scene. The results reveal a quantified benefit-disruption spectrum of gaze on utterance comprehension and, further, show that gaze is used, even during the initial movement phase, to restrict the spatial domain of potential referents. These findings more broadly suggest that people treat artificial agents similarly to human agents and thus validate such a setting for further explorations of joint attention mechanisms.

17.
18.
Previous research has shown that listeners follow speaker gaze to mentioned objects in a shared environment to ground referring expressions, both for human and robot speakers. What is less clear is whether the benefit of speaker gaze is due to the inference of referential intentions (Staudte & Crocker, 2011) or simply to (reflexive) shifts in visual attention. That is, is gaze special in how it affects simultaneous utterance comprehension? In four eye-tracking studies we directly contrast speech-aligned speaker gaze of a virtual agent with a non-gaze visual cue (an arrow). Our findings show that both cues direct listeners' attention in similar ways and that listeners can benefit in utterance comprehension from both. Only when the two cues are similarly precise, however, does this equality extend to incongruent cueing sequences: that is, only then can listeners benefit from gaze as well as arrows even when the cue sequence does not match the concurrent sequence of spoken referents. The results suggest that listeners are able to learn a counter-predictive mapping of both cues to the sequence of referents. Thus, gaze and arrows can in principle be applied with equal flexibility and efficiency during language comprehension.

19.
An ongoing issue in visual cognition concerns the roles played by low- and high-level information in guiding visual attention, with current research remaining inconclusive about the interaction between the two. In this study, we bring fresh evidence into this long-standing debate by investigating visual saliency and contextual congruency during object naming (Experiment 1), a task in which visual processing interacts with language processing. We then compare the results of this experiment to data from a memorization task using the same stimuli (Experiment 2). In Experiment 1, we find that both saliency and congruency influence visual and naming responses and interact with linguistic factors. In particular, incongruent objects are fixated later and less often than congruent ones. However, saliency is a significant predictor of object naming, with salient objects being named earlier in a trial. Furthermore, the saliency and congruency of a named object interact with the lexical frequency of the associated word and mediate the time course of fixations at naming. In Experiment 2, we find a similar overall pattern in the eye-movement responses, but only the congruency of the target is a significant predictor, with incongruent targets fixated less often than congruent targets. Crucially, this finding contrasts with claims in the literature that incongruent objects are more informative than congruent objects because they deviate from scene context and hence need longer processing. Overall, this study suggests that different sources of information are used interactively to guide visual attention to the targets to be named, and it raises new questions for existing theories of visual attention.

20.
We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. For this, we propose a visual object candidate generation method to produce object hypotheses about the objects in the scene. An attention system is used to prioritise the processing of visual information by (1) localising candidate objects, and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. This spatial IOR mechanism naturally copes with camera motions and inhibits objects that have already been the target of attention. Our approach provides object candidates which can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher level tasks, our architecture can be used as a first layer in any cognitive system that aims at interpreting a stream of images. We show in the evaluation how our framework finds most of the objects in challenging real-world scenes.
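A minimal sketch of the spatial IOR idea described above: inhibition is keyed to world coordinates rather than image coordinates, so an attended object stays suppressed even as the camera moves (the class name, the hard distance threshold, and the saliency weighting are assumptions, not the paper's implementation):

```python
import numpy as np

class SpatialIOR:
    """Inhibition-of-return map in world coordinates: once attended,
    a location is suppressed regardless of camera motion."""
    def __init__(self, radius=0.15):
        self.inhibited = []    # 3-D points already attended (world frame)
        self.radius = radius   # suppression radius in metres (assumed value)

    def attend(self, point_world):
        self.inhibited.append(np.asarray(point_world, dtype=float))

    def weight(self, candidate_world):
        """Return 0 if the candidate lies near an already-attended point, else 1."""
        c = np.asarray(candidate_world, dtype=float)
        return 0.0 if any(np.linalg.norm(c - p) < self.radius
                          for p in self.inhibited) else 1.0

def next_target(candidates, saliencies, ior):
    """Pick the most salient candidate that is not spatially inhibited."""
    scores = [s * ior.weight(c) for c, s in zip(candidates, saliencies)]
    return int(np.argmax(scores))
```

Scoring candidates by saliency times the IOR weight makes the most salient not-yet-attended object the next fixation target, which is the exploration behaviour the abstract describes.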
