Similar articles
20 similar articles retrieved (search time: 15 ms)
1.
We present a computational framework for attention-guided visual scene exploration in sequences of RGB-D data. For this, we propose a visual object candidate generation method to produce object hypotheses about the objects in the scene. An attention system is used to prioritise the processing of visual information by (1) localising candidate objects, and (2) integrating an inhibition of return (IOR) mechanism grounded in spatial coordinates. This spatial IOR mechanism naturally copes with camera motions and inhibits objects that have already been the target of attention. Our approach provides object candidates which can be processed by higher cognitive modules such as object recognition. Since objects are basic elements for many higher level tasks, our architecture can be used as a first layer in any cognitive system that aims at interpreting a stream of images. We show in the evaluation how our framework finds most of the objects in challenging real-world scenes.
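A minimal sketch of how a spatially grounded IOR mechanism of this kind might be organised (the radius, capacity, and saliency values are illustrative assumptions, not the authors' implementation): because attended locations are stored in world coordinates rather than image coordinates, inhibition survives camera motion across the RGB-D sequence.

```python
import numpy as np

class SpatialIOR:
    """Inhibition of return anchored in world coordinates, so it survives camera motion.
    Candidate positions are assumed to be given in a fixed world frame (e.g. after
    registering each RGB-D frame); radius and capacity are illustrative parameters."""

    def __init__(self, radius=0.25, capacity=20):
        self.radius = radius      # metres around an attended object that stay inhibited
        self.capacity = capacity  # forget the oldest inhibited location beyond this
        self.inhibited = []       # 3-D points that have already been attended

    def is_inhibited(self, position):
        return any(np.linalg.norm(position - p) < self.radius for p in self.inhibited)

    def select(self, candidates):
        """candidates: list of (saliency, xyz_position) object hypotheses for one frame."""
        admissible = [(s, p) for s, p in candidates if not self.is_inhibited(p)]
        if not admissible:
            return None           # everything in view has already been attended
        saliency, position = max(admissible, key=lambda c: c[0])
        self.inhibited.append(position)
        self.inhibited = self.inhibited[-self.capacity:]
        return position

ior = SpatialIOR()
frame_candidates = [(0.8, np.array([0.1, 0.0, 1.2])), (0.5, np.array([0.4, 0.1, 0.9]))]
print(ior.select(frame_candidates))   # attends the most salient candidate, then inhibits its location
print(ior.select(frame_candidates))   # same scene: the next-most-salient candidate is chosen
```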

2.
任衍具  孙琪 《心理学报》2014,46(11):1613-1627
Using a dual-task paradigm that combined visuospatial working memory tasks with a real-world scene search task, and using eye tracking to divide the search process into an initiation phase, a scanning phase, and a verification phase, this study examined how visuospatial working memory load affects search performance in real-world scenes, as well as the moderating roles of whether the search target changed across trials, the specificity of the target template, and the visual clutter of the search scenes. The results showed that visuospatial working memory load impaired real-world scene search performance: during search, both object and spatial load lengthened the scanning phase and increased the number of fixations, and spatial load additionally lengthened the verification phase; the effect of visuospatial load on the search process depended on the specificity of the target template. Spatial load reduced search efficiency in real-world scenes, and this effect depended on the visual clutter of the scenes, whereas object load did not. Thus, object and spatial working memory loads affect real-world scene search performance differently: the effect of spatial load on the search process is longer-lasting than that of object load, both effects are moderated by the specificity of the target template, and only spatial load reduces search efficiency in real-world scenes, an effect moderated by the visual clutter of the search scenes.

3.
DRC: a dual route cascaded model of visual word recognition and reading aloud (total citations: 55; self-citations: 0; citations by others: 55)
This article describes the Dual Route Cascaded (DRC) model, a computational model of visual word recognition and reading aloud. The DRC is a computational realization of the dual-route theory of reading, and is the only computational model of reading that can perform the 2 tasks most commonly used to study reading: lexical decision and reading aloud. For both tasks, the authors show that a wide variety of variables that influence human latencies influence the DRC model's latencies in exactly the same way. The DRC model simulates a number of such effects that other computational models of reading do not, but there appear to be no effects that any other current computational model of reading can simulate but that the DRC model cannot. The authors conclude that the DRC model is the most successful of the existing computational models of reading.
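As a toy illustration of the dual-route idea that the DRC realises (this is not the model's cascaded, interactive-activation implementation; the lexicon and grapheme-phoneme rules below are invented for the example), a lexical route retrieves stored pronunciations and handles irregular words, while a nonlexical rule-based route is what allows nonwords to be read aloud:

```python
# Toy illustration of the dual-route idea behind DRC (not the actual cascaded
# implementation): a lexical route that looks up stored pronunciations, and a
# nonlexical route that assembles pronunciations from grapheme-phoneme rules.

LEXICON = {"pint": "/paInt/", "mint": "/mInt/", "have": "/hav/"}          # stored word forms
GPC_RULES = [("int", "Int"), ("ave", "eIv"), ("p", "p"), ("m", "m"),
             ("h", "h"), ("sl", "sl")]                                     # grapheme -> phoneme

def nonlexical_route(letters):
    """Assemble a pronunciation from grapheme-phoneme correspondences, left to right."""
    phonemes, rest = "", letters
    while rest:
        for grapheme, phoneme in GPC_RULES:
            if rest.startswith(grapheme):
                phonemes += phoneme
                rest = rest[len(grapheme):]
                break
        else:
            rest = rest[1:]            # unknown grapheme: skip (a real model would do better)
    return "/" + phonemes + "/"

def read_aloud(letters):
    # The lexical route wins for known words (correct for irregular words like "pint");
    # nonwords such as "slint" can only be read via the nonlexical route.
    return LEXICON.get(letters) or nonlexical_route(letters)

for word in ["pint", "slint", "have"]:
    print(word, "->", read_aloud(word), "| rule-based only:", nonlexical_route(word))
```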

4.
Multiple factors have been proposed to contribute to the other-race effect in face recognition, including perceptual expertise and social-cognitive accounts. Here, we propose to understand the effect and its contributing factors from the perspective of learning mechanisms that involve joint learning of visual attention strategies and internal representations for faces, which can be modulated by the quality of contact with other-race individuals, including emotional and motivational factors. Computational simulations of this process will enhance our understanding of interactions among factors and help resolve inconsistent results in the literature. In particular, since learning is driven by task demands, visual attention effects observed in different face-processing tasks, such as passive viewing or recognition, are likely to be task specific (although they may be associated) and should be examined and compared separately. When examining visual attention strategies, the use of more data-driven and comprehensive eye movement measures, taking both the spatial-temporal pattern and the consistency of eye movements into account, can lead to novel discoveries in other-race face processing. The proposed framework and analysis methods may be applied to other tasks of real-life significance, such as facial emotion recognition, further enhancing our understanding of the relationship between learning and visual cognition.
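One example of the more data-driven eye-movement measures alluded to here is fitting a hidden Markov model to fixation sequences, so that both the spatial regions of interest and the temporal transitions between them are learned from the data. A minimal sketch using the third-party hmmlearn package; the fixation coordinates, trial lengths, and number of states are placeholders:

```python
import numpy as np
from hmmlearn import hmm   # third-party package; hidden states act as data-driven ROIs

rng = np.random.default_rng(0)
# Placeholder fixation data: (x, y) fixation coordinates for two trials, concatenated.
fixations = rng.normal(loc=[[100, 80]] * 30 + [[220, 150]] * 20, scale=15)
lengths = [30, 20]                     # number of fixations per trial

# Each hidden state corresponds to a data-driven region of interest (e.g. eyes, mouth);
# the transition matrix summarises the spatio-temporal scanning strategy.
model = hmm.GaussianHMM(n_components=2, covariance_type="full", random_state=0)
model.fit(fixations, lengths)

print("ROI centres:\n", model.means_)
print("Transition probabilities:\n", model.transmat_)
```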

5.
How do we learn to recognize visual categories, such as dogs and cats? Somehow, the brain uses limited variable examples to extract the essential characteristics of new visual categories. Here, I describe an approach to category learning and recognition that is based on recent computational advances. In this approach, objects are represented by a hierarchy of fragments that are extracted during learning from observed examples. The fragments are class-specific features and are selected to deliver a high amount of information for categorization. The same fragment hierarchy is then used for general categorization, individual object recognition and object-parts identification. Recognition is also combined with object segmentation, using stored fragments, to provide a top-down process that delineates object boundaries in complex cluttered scenes. The approach is computationally effective and provides a possible framework for categorization, recognition and segmentation in human vision.
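The selection step described here, keeping fragments that deliver a high amount of information for categorization, is naturally expressed as scoring each candidate fragment by the mutual information between its detection and the class label. A minimal sketch (fragment detection itself is assumed to have already produced the binary vectors):

```python
import numpy as np

def mutual_information(fragment_present, is_class_member):
    """Mutual information (in bits) between a binary fragment-detection variable and a
    binary class label, estimated from co-occurrence frequencies over training images."""
    fragment_present = np.asarray(fragment_present, dtype=bool)
    is_class_member = np.asarray(is_class_member, dtype=bool)
    mi = 0.0
    for f in (True, False):
        for c in (True, False):
            p_fc = np.mean((fragment_present == f) & (is_class_member == c))
            p_f = np.mean(fragment_present == f)
            p_c = np.mean(is_class_member == c)
            if p_fc > 0:
                mi += p_fc * np.log2(p_fc / (p_f * p_c))
    return mi

# Toy example: a fragment that fires mostly on class images is more informative.
labels    = [1, 1, 1, 1, 0, 0, 0, 0]            # 1 = image contains the category
good_frag = [1, 1, 1, 0, 0, 0, 0, 0]            # fires on most class images only
poor_frag = [1, 0, 1, 0, 1, 0, 1, 0]            # fires regardless of class
print(mutual_information(good_frag, labels))    # high information -> keep in the hierarchy
print(mutual_information(poor_frag, labels))    # ~0 bits -> discard
```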

6.
Face recognition is a computationally challenging classification task. Deep convolutional neural networks (DCNNs) are brain-inspired algorithms that have recently reached human-level performance in face and object recognition. However, it is not clear to what extent DCNNs generate a human-like representation of face identity. We have recently revealed a subset of facial features that are used by humans for face recognition. This enables us to ask whether DCNNs rely on the same facial information and whether this human-like representation depends on a system that is optimized for face identification. In the current study, we examined how DCNNs represent faces that differ in features that are critical or non-critical for human face recognition. Our findings show that DCNNs optimized for face identification are tuned to the same facial features used by humans for face recognition. Sensitivity to these features was highly correlated with the performance of the DCNN on a benchmark face recognition task. Moreover, sensitivity to these features and a view-invariant face representation emerged at higher layers of a DCNN optimized for face recognition but not for object recognition. This finding parallels the division into a face system and an object system in high-level visual cortex. Taken together, these findings validate human perceptual models of face recognition, enable us to use DCNNs to test predictions about human face and object recognition, and contribute to the interpretability of DCNNs.
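One simple way to quantify the kind of feature sensitivity examined in such a study is to compare embedding distances for face pairs that differ in critical versus non-critical features. The sketch below assumes face embeddings from some DCNN layer are already available as vectors; the embeddings and perturbations here are random placeholders:

```python
import numpy as np

def cosine_distance(a, b):
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def feature_sensitivity(pairs):
    """Mean embedding distance over face pairs that differ only in a given set of features."""
    return float(np.mean([cosine_distance(a, b) for a, b in pairs]))

rng = np.random.default_rng(1)
base = rng.normal(size=(5, 128))                       # placeholder embeddings of 5 faces
critical_pairs     = [(v, v + rng.normal(scale=0.5, size=128)) for v in base]
non_critical_pairs = [(v, v + rng.normal(scale=0.1, size=128)) for v in base]

# A network tuned to the human-critical features should separate critical-feature changes
# more than non-critical ones; the "changes" above are random stand-ins for illustration.
print("critical-feature sensitivity:    ", feature_sensitivity(critical_pairs))
print("non-critical-feature sensitivity:", feature_sensitivity(non_critical_pairs))
```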

7.
Perceptual processing in the brain is not driven solely by external stimuli; it is also subject to top-down perceptual modulation. Although this phenomenon has been confirmed by a large body of experimental research, its neural mechanisms remain an important open question in cognitive neuroscience. This review systematically introduces the neural basis of perceptual modulation, the forms it takes, the paradigms used to study it, and its theoretical models; it then analyses the main problems facing current research and outlines directions for future work, with the aim of promoting further research on this question.

8.
9.
Aligning pictorial descriptions: an approach to object recognition (total citations: 12; self-citations: 0; citations by others: 12)
S Ullman 《Cognition》1989,32(3):193-254

10.
Computational theories of vision typically rely on the analysis of two aspects of human visual function: (1) object and shape recognition, and (2) co-calibration of sensory measurements. Both approaches are usually based on an inverse-optics model, in which visual perception is viewed as a process of inference from a 2D retinal projection to a 3D percept within a Euclidean space schema. This paradigm has had great success in certain areas of vision science, but it has been less successful in understanding perceptual representation, that is, the nature of the perceptual encoding. One drawback of inverse-optics approaches has been the difficulty of defining the constraints needed to make the inference computationally tractable (e.g. regularity assumptions or Bayesian priors). These constraints, thought to be learned assumptions about the physical and optical structure of the external world, have to be incorporated into any workable computational model in the inverse-optics paradigm. But inference models that employ an inverse-optics-plus-structural-assumptions approach inevitably result in a naïve realist theory of perceptual representation. Another drawback of inference models for theories of perceptual representation is their inability to explain central features of visual experience. The feature most evident in the design process, and in the visual understanding of designs, is that some visual configurations appear, often spontaneously, perceptually more coherent than others. The epistemological consequences of inferential approaches to vision indicate that they fail to capture enduring aspects of our visual experience. They may therefore not be suited to a theory of perceptual representation, or useful for understanding the role of perception in the design process and its products.

11.
Visual working memory load in the n-back task (total citations: 3; self-citations: 0; citations by others: 3)
Using the n-back paradigm, the experiment examined the relationship between visual working memory load and reaction time, discriminability, and subjective ratings, as well as the difference between spatial-location memory load and visual-pattern memory load. The results showed that as memory load increased, reaction time lengthened, discriminability decreased, and subjective load increased. The load imposed by the visual-pattern memory task was higher than that imposed by the spatial-location memory task, which also provides evidence that visual-pattern memory and visuospatial memory involve different processes.
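The discriminability measure in studies of this kind is typically d' from signal detection theory, computed from hits and false alarms on n-back target trials. A minimal sketch, using a log-linear correction so extreme rates stay finite (the counts in the example are hypothetical):

```python
from statistics import NormalDist

def d_prime(hits, misses, false_alarms, correct_rejections):
    """Signal-detection discriminability (d') from n-back response counts,
    with a log-linear correction so hit/false-alarm rates of 0 or 1 stay finite."""
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

# Example: performance in a hypothetical 2-back block
print(d_prime(hits=18, misses=2, false_alarms=4, correct_rejections=36))
```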

12.
Lucia M. Vaina 《Synthese》1990,83(1):49-91
In this paper we focus on the modularity of visual functions in the human visual cortex, that is, the specific problems that the visual system must solve in order to achieve recognition of objects and visual space. The computational theory of early visual functions is briefly reviewed and is then used as a basis for suggesting computational constraints on the higher-level visual computations. The remainder of the paper presents neurological evidence for the existence of two visual systems in man, one specialized for spatial vision and the other for object vision. We show further clinical evidence for the computational hypothesis that these two systems consist of several visual modules, some of which can be isolated on the basis of specific visual deficits which occur after lesions to selected areas in the visually responsive brain. We will provide examples of visual modules which solve information processing tasks that are mediated by specific anatomic areas. We will show that the clinical data from behavioral studies of monkeys (Ungerleider and Mishkin 1984) supports the distinction between two visual systems in monkeys, the what system, involved in object vision, and the where system, involved in spatial vision.

13.
A successful vision system must solve the problem of deriving geometrical information about three-dimensional objects from two-dimensional photometric input. The human visual system solves this problem with remarkable efficiency, and one challenge in vision research is to understand how neural representations of objects are formed and what visual information is used to form these representations. Ideal observer analysis has demonstrated the advantages of studying vision from the perspective of explicit generative models and a specified visual task, which divides the causes of image variations into the separate categories of signal and noise. Classification image techniques estimate the visual information used in a task from the properties of “noise” images that interact most strongly with the task. Both ideal observer analysis and classification image techniques rely on the assumption of a generative model. We show here how the ability of the classification image approach to understand how an observer uses visual information can be improved by matching the type and dimensionality of the model to that of the neural representation or internal template being studied. Because image variation in real world object tasks can arise from both geometrical shape and photometric (illumination or material) changes, a realistic image generation process should model geometry as well as intensity. A simple example is used to demonstrate what we refer to as a “classification object” approach to studying three-dimensional object representations.
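A minimal sketch of the classical classification-image estimate from such "noise" images: average the noise fields by stimulus and response, and combine them so that pixels pushing the observer toward "signal present" responses come out positive (the trial data below are random placeholders):

```python
import numpy as np

def classification_image(noise_fields, stimulus_present, responded_present):
    """Estimate a classification image from per-trial noise fields (H x W arrays).
    Uses the standard yes/no combination: mean noise on 'yes' trials minus 'no' trials,
    computed separately within signal-present and signal-absent trials, then summed."""
    noise_fields = np.asarray(noise_fields, dtype=float)
    stimulus_present = np.asarray(stimulus_present, dtype=bool)
    responded_present = np.asarray(responded_present, dtype=bool)

    def mean_noise(stim, resp):
        sel = (stimulus_present == stim) & (responded_present == resp)
        return noise_fields[sel].mean(axis=0) if sel.any() else np.zeros(noise_fields.shape[1:])

    return (mean_noise(True, True) + mean_noise(False, True)
            - mean_noise(True, False) - mean_noise(False, False))

# Placeholder data: 200 trials of 16x16 noise fields with random stimuli and responses
rng = np.random.default_rng(0)
noise = rng.normal(size=(200, 16, 16))
stim = rng.integers(0, 2, size=200).astype(bool)
resp = rng.integers(0, 2, size=200).astype(bool)
ci = classification_image(noise, stim, resp)
print(ci.shape)   # (16, 16): pixels that pushed the observer toward "yes" responses
```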

14.
Human beings effortlessly perceive stimuli through their sensory systems in order to learn about, understand, recognize, and act on their environment. Over the years, efforts have been made to enable cybernetic entities to approach human performance on perception tasks and, more generally, to bring artificial intelligence closer to human intelligence. Neuroscience and other cognitive sciences provide evidence about, and explanations of, how certain aspects of visual perception work in the human brain. Visual perception is a complex process and has been divided into several parts; object classification is one of them, and it is necessary for a declarative interpretation of the environment. This article addresses the object classification problem. We propose a computational model of visual object classification grounded in neuroscience, consisting of two modular systems: a visual processing system, in charge of feature extraction, and a perception sub-system, which classifies objects based on the features extracted by the visual processing system. The results obtained are analyzed using similarity and dissimilarity matrices. Based on the neuroscientific evidence and the results of this research, we also suggest directions for future work that would bring us closer to performing visual classification as humans do.
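A minimal sketch of the two-part structure described, a feature-extraction stage followed by a classification stage, together with the kind of dissimilarity matrix used in the analysis. The random features and nearest-mean classifier are generic stand-ins, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "visual processing system": the paper extracts visual features;
# here each class is just a cloud of random 32-dimensional vectors.
class_names = ["cup", "chair", "dog"]
features = {name: rng.normal(loc=i, size=(20, 32)) for i, name in enumerate(class_names)}

# Stand-in "perception sub-system": a nearest-class-mean classifier over the features.
class_means = {name: feats.mean(axis=0) for name, feats in features.items()}

def classify(x):
    return min(class_means, key=lambda name: np.linalg.norm(x - class_means[name]))

# Dissimilarity matrix between class representations, as used in the analysis.
dissimilarity = np.array([[np.linalg.norm(class_means[a] - class_means[b])
                           for b in class_names] for a in class_names])
print(classify(features["dog"][0]))
print(np.round(dissimilarity, 2))
```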

15.
The role of visual working memory in visual search (total citations: 1; self-citations: 0; citations by others: 1)
The role of visual working memory in visual search is far more complex than the biased-competition model describes: individuals can flexibly use working memory to guide attentional selection according to current task demands. Building on a systematic review of existing research, this article discusses the role of visual working memory in visual search from three perspectives: object working memory, spatial working memory, and executive working memory. Drawing on recent studies, the article concludes with seven summary points about the role of visual working memory in visual search, offers explanations for several problems in previous research, and argues that future work should examine the visual memory involved in visual search from the perspectives of prospective memory and implicit memory, among others.

16.
Visual object recognition is foundational to processes of categorization, tool use, and real-world problem solving. Despite considerable effort across many disciplines and many specific advances, there is no comprehensive or well-accepted account of this ability. Moreover, none of the extant approaches consider how human object recognition develops. New evidence indicates a period of rapid change in toddlers' visual object recognition between 18 and 24 months that is related to the learning of object names and to goal-directed action. Children appear to shift from recognition based on piecemeal fragments to recognition based on geometric representations of three-dimensional shape. These findings may lead to a more unified understanding of the processes that make human object recognition as impressive as it is.

17.
Planning and decision-making are two of the cognitive functions involved in problem solving. These functions, among others, have been studied from the point of view of cognitive informatics, a new field focused on the development of cognitive architectures, autonomous agents, and humanoid robots capable of showing human-like behavior. We present an extensive study of current biological and computational models proposed in the fields of neuroscience, psychology, and cognitive informatics, along with a detailed review of the brain areas involved in planning, decision-making, and affect. Most of the proposed computational models, however, seek only to mimic external human behavior. This paper aims to contribute to the cognitive informatics field with an innovative cognitive computational model of planning and decision-making. The two main differences between our model and current models in the literature are: (i) our model treats affective and motivational information as a basic and essential trigger of planning and decision-making processes; (ii) our model attempts to mimic both internal brain processes and external human behavior. We developed a computational model that offers a direct mapping from human brain areas to the computational modules of our model. We therefore present the model from conceptual, formal, and computational perspectives in order to show how it should be implemented. Finally, a set of tests was conducted to validate the proposal; these tests show an interesting comparison between the behavior of our prototype and the behavior exhibited by participants in a case study.

18.
To survive on today's highways, a driver must have highly developed skills in visually guided collision avoidance. To play such games as cricket, tennis or baseball demands accurate, precise and reliable collision achievement. This review discusses evidence that some of these tasks are performed by predicting where an object will be at some sharply defined instant, several hundred milliseconds in the future, while other tasks are performed by utilizing the fact that some of our motor actions change what we see in ways that obey lawful relationships, and can therefore be learned. Several monocular and binocular visual correlates of the direction of an object's motion relative to the observer's head have been derived theoretically, along with visual correlates of the time to collision with an approaching object. Although laboratory psychophysics can identify putative neural mechanisms by showing which of the known correlates are processed by the human visual system independently of other visual information, it is only field research on, for example, driving, aviation and sport that can show which visual cues are actually used in these activities. This article reviews this research and describes a general psychophysically based rational approach to the design of such field studies.
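The best-known monocular correlate of time to collision discussed in this literature is the optical variable tau: for an object approaching the eye at roughly constant speed, the time remaining to contact is approximated by the ratio of the object's angular size to its rate of angular expansion,

```latex
\tau \;=\; \frac{\theta}{\mathrm{d}\theta/\mathrm{d}t} \;\approx\; \text{time to collision},
```

where θ is the object's instantaneous angular subtense at the eye. This quantity is available directly from the changing retinal image, without knowledge of the object's physical size, distance, or speed.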

19.
This article presents a theory of visual word recognition that assumes that, in the tasks of word identification, lexical decision, and semantic categorization, human readers behave as optimal Bayesian decision makers. This leads to the development of a computational model of word recognition, the Bayesian reader. The Bayesian reader successfully simulates some of the most significant data on human reading. The model accounts for the nature of the function relating word frequency to reaction time and identification threshold, the effects of neighborhood density and its interaction with frequency, and the variation in the pattern of neighborhood density effects seen in different experimental tasks. Both the general behavior of the model and the way the model predicts different patterns of results in different tasks follow entirely from the assumption that human readers approximate optimal Bayesian decision makers.
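The core computation attributed to the optimal reader can be sketched as Bayesian inference over a lexicon in which each word's prior is its frequency of occurrence and the likelihood comes from noisy perceptual evidence. The toy lexicon and letter-overlap likelihood below are placeholders, not the model's actual perceptual-sampling scheme:

```python
import numpy as np

# Toy lexicon with occurrence frequencies acting as prior probabilities.
lexicon = {"cat": 120, "car": 300, "cap": 40, "can": 500}

def likelihood(word, noisy_input, noise=0.2):
    """Probability of the noisy letter string given the word: each letter matches with
    high probability, mismatches with probability `noise` (a crude stand-in for the
    model's accumulation of noisy perceptual samples)."""
    return float(np.prod([(1 - noise) if w == o else noise
                          for w, o in zip(word, noisy_input)]))

def posterior(noisy_input):
    prior = np.array(list(lexicon.values()), dtype=float)
    prior /= prior.sum()
    like = np.array([likelihood(w, noisy_input) for w in lexicon])
    unnormalised = prior * like
    return dict(zip(lexicon, unnormalised / unnormalised.sum()))

# High-frequency words need less perceptual evidence to dominate the posterior,
# which is how frequency effects on thresholds and latencies arise in such a model.
print(posterior("caq"))   # ambiguous final letter: the frequency prior decides
```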

20.
Both humans and non‐human animals exhibit sensitivity to the approximate number of items in a visual array, as indexed by their performance in numerosity discrimination tasks, and even neonates can detect changes in numerosity. These findings are often interpreted as evidence for an innate ‘number sense’. However, recent simulation work has challenged this view by showing that human‐like sensitivity to numerosity can emerge in deep neural networks that build an internal model of the sensory data. This emergentist perspective posits a central role for experience in shaping our number sense and might explain why numerical acuity progressively increases over the course of development. Here we substantiate this hypothesis by introducing a progressive unsupervised deep learning algorithm, which allows us to model the development of numerical acuity through experience. We also investigate how the statistical distribution of numerical and non‐numerical features in natural environments affects the emergence of numerosity representations in the computational model. Our simulations show that deep networks can exhibit numerosity sensitivity prior to any training, as well as a progressive developmental refinement that is modulated by the statistical structure of the learning environment. To validate our simulations, we offer a refinement to the quantitative characterization of the developmental patterns observed in human children. Overall, our findings suggest that it may not be necessary to assume that animals are endowed with a dedicated system for processing numerosity, since domain‐general learning mechanisms can capture key characteristics others have attributed to an evolutionarily specialized number system.
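A small sketch of the kind of stimulus statistics such simulations manipulate: dot arrays in which numerosity varies while a non-numerical feature (total dot area) either covaries with number or is held constant, so its contribution to the learned representation can be separated. The generator below is illustrative, not the authors' stimulus set:

```python
import numpy as np

def dot_array(numerosity, total_area=None, field_size=100, rng=None):
    """Generate dot centres and radii for one array. If `total_area` is given, dot sizes
    are scaled so cumulative area is fixed across numerosities (decorrelating this
    non-numerical cue from number); otherwise dot size is constant and area covaries."""
    if rng is None:
        rng = np.random.default_rng()
    centres = rng.uniform(0, field_size, size=(numerosity, 2))
    if total_area is None:
        radii = np.full(numerosity, 3.0)                       # area grows with numerosity
    else:
        radii = np.full(numerosity, np.sqrt(total_area / (np.pi * numerosity)))
    return centres, radii

for n in (4, 8, 16):
    _, r_covary = dot_array(n)
    _, r_fixed = dot_array(n, total_area=400)
    print(n, "dots | covarying area:", round(np.pi * (r_covary**2).sum(), 1),
          "| fixed area:", round(np.pi * (r_fixed**2).sum(), 1))
```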
