In the present experiment, participants explored line drawings of scenes in the context of an object-decision task while eye-contingent display changes manipulated the appearance of the foveal part of the image. Foveal information was replaced by an ovoid noise mask for 83 ms, after a preset delay of 15, 35, 60, or 85 ms following the onset of fixations. In control conditions, a red ellipse appeared for 83 ms, centered around the fixation position, after the same delays as in the noise-mask conditions. Scene exploration was hampered especially when foveal masking occurred early during fixations, replicating earlier findings. Furthermore, fixation durations increased linearly as the mask delay decreased, which validates fixation duration as a measure of perceptual processing speed.

Human gaze control during real-world scene perception   (total citations: 17; self-citations: 0; citations by others: 17)
In human vision, acuity and color sensitivity are best at the point of fixation, and the visual-cognitive system exploits this fact by actively controlling gaze to direct fixation towards important and informative scene regions in real time as needed. How gaze control operates over complex real-world scenes has recently become of central concern in several core cognitive science disciplines including cognitive psychology, visual neuroscience, and machine vision. This article reviews current approaches and empirical findings in human gaze control during real-world scene perception.

Global transsaccadic change blindness during scene perception   (total citations: 1; self-citations: 0; citations by others: 1)
Each time the eyes are spatially reoriented via a saccadic eye movement, the image falling on the retina changes. How visually specific are the representations that are functional across saccades during active scene perception? This question was investigated with a saccade-contingent display-change paradigm in which pictures of complex real-world scenes were globally changed in real time during eye movements. The global changes were effected by presenting each scene as an alternating set of scene strips and occluding gray bars, and by reversing the strips and bars during specific saccades. The results from two experiments demonstrated a global transsaccadic change-blindness effect, suggesting that point-by-point visual representations are not functional across saccades during complex scene perception.

Gordon RD 《Memory & cognition》2006,34(7):1484-1494
In two experiments, we examined the role of semantic scene content in guiding attention during scene viewing. In each experiment, performance on a lexical decision task was measured following the brief presentation of a scene. The lexical decision stimulus named an object that was either present or not present in the scene. The results of Experiment 1 revealed no priming from inconsistent objects (whose identities conflicted with the scene in which they appeared), but negative priming from consistent objects. The results of Experiment 2 indicated that negative priming from consistent objects occurs only when inconsistent objects are present in the scenes. Together, the results suggest that observers are likely to attend to inconsistent objects, and that representations of consistent objects are suppressed in the presence of an inconsistent object. Furthermore, the data suggest that inconsistent objects draw attention because they are relatively difficult to identify in an inappropriate context.

A largely unexplored aspect of lexical access in visual word recognition is “semantic size”, namely, the real-world size of an object to which a word refers. A total of 42 participants performed a lexical decision task on concrete nouns denoting either big or small objects (e.g., bookcase or teaspoon). Items were matched pairwise on relevant lexical dimensions. Participants' reaction times were reliably faster to semantically “big” versus “small” words. The results are discussed in terms of possible mechanisms, including more active representations for “big” words, due to the ecological importance attributed to large objects in the environment and the relative speed of neural responses to large objects.

Functional magnetic resonance imaging (fMRI) has identified distinct brain regions in ventral occipitotemporal cortex (VOTC) and lateral occipitotemporal cortex (LOTC) that are differentially activated by pictures of faces and bodies. Recent work from our laboratory has shown that the strong LOTC activation evoked by bodies in which the face is occluded is attenuated when the occlusion is removed. We hypothesized that this attenuation may occur because subjects preferentially fixate upon faces when present in the scene. Here, we experimentally manipulated subjects’ fixations while they viewed a static picture of a character whose face, hand, and torso were continuously visible throughout each run. The subject’s saccades and fixations were guided by a small fixation cross that made discrete jumps to a new location every 500 ms. Subjects were instructed to follow the fixation cross and make a button press whenever it changed size. In a series of blocks, the fixation cross shifted from locations on the face, on the hand, and to locations on a background image of a phase-scrambled face. In a second study, the fixation cross moved similarly, but the hand locations were changed to locations along the character’s body or torso. A localizer task was used to identify face- and body-sensitive regions of LOTC. Body-sensitive regions were strongly activated when the subjects’ saccades were guided over the character’s torso relative to when the saccades were guided over the character’s face. Little to no activity occurred in the body-sensitive region of LOTC when the subjects’ saccades were guided over the character’s hand. The localizer task was unable to differentiate body-sensitive regions in lateral VOTC from face-sensitive regions, or body-sensitive regions in medial VOTC from flower-sensitive regions. Guided saccades over the body strongly activated both lateral and medial VOTC. 
These results provide new insights into the function of body-sensitive visual areas in both LOTC and VOTC, and illustrate the potential confounding influence of uncontrolled eye movements for neuroimaging studies of social perception.

Eye movements and scene perception.   (total citations: 11; self-citations: 0; citations by others: 11)
Research on eye movements and scene perception is reviewed. Following an initial discussion of some basic facts about eye movements and perception, the following topics are discussed: (1) the span of effective vision during scene perception, (2) the role of eye movements in scene perception, (3) integration of information across saccades, (4) scene context, object identification and eye movements, and (5) the control of eye movements. The relationship of eye movements during reading to eye movements during scene perception is considered. A preliminary model of eye movement control in scene perception is described and directions for future research are suggested.

Montreal, Toronto, and Vancouver have attracted most new immigrants to Canada. Small and medium-sized cities in Canada are keen to share the wealth that new immigrants represent, and federal and provincial governments support a more even distribution of settlement. As a result, the idea of attracting new immigrants to smaller locations is a pressing policy issue. This research weighs the characteristics of place that new immigrants consider on arrival. It uses findings from the Longitudinal Survey of Immigrants to Canada (Statistics Canada, 2003) to construct an index that ranks five medium-sized cities in British Columbia in terms of their potential attractiveness to new immigrants. The index created proves robust and reliable from a statistical viewpoint. The study confirms that immigrants are attracted to cities where friends and family or other immigrants live. Moreover, the increase in attractiveness of a city is primarily related to its size. The index is an indicator of the role that population and the extant number of immigrants in situ plays in determining the appeal of smaller cities. From a policy perspective, if governments wish to “spread the wealth” associated with immigration and an expanded labour force, a proactive policy stance that enumerates and communicates the appeal of less prominent communities is vital. This is an important finding, and we offer policy options that account for the relationship of population size to immigrant retention.

In this study, we examined the characteristics of on-line scene representations, using a partial-report procedure. Subjects inspected a simple scene containing seven objects for 1, 3, 5, 9, or 15 fixations; shortly after scene offset, a marker cued one scene location for report. Consistent with previous research, the results indicated that scene representations are relatively sparse; even after 15 fixations on a scene, the subjects remembered the position/identity pairings for only about 78% of the objects in the scene, or the equivalent of about five objects-worth of information. Report of the last three objects that were foveated and of the object about to be foveated was very accurate, however, suggesting that recently attended information in a scene is represented quite well. Information about the scene appeared to accumulate over multiple fixations, but the capacity of the on-line scene representation appeared to be limited to about five items. Implications for recent theories of scene representation are discussed.

Previous research suggested that perception of spatial location is biased towards spatial goals of planned hand movements. In the present study I show that an analogous perceptual distortion can be observed if attention is paid to a spatial location in the absence of planning a hand movement. Participants judged the position of a target during preparation of a mouse movement, the end point of which could deviate from the target by a varying degree in Exp. 1. Judgments of target position were systematically affected by movement characteristics consistent with perceptual assimilation between the target and the planned movement goal. This effect was neither due to an impact of motor execution on judgments (Exp. 2) nor due to characteristics of the movement cues or of certain target positions (Exp. 3, Exp. 5A). When the task included deployment of attention to spatial positions (former movement goals) in preparation for a secondary perceptual task, an effect emerged that was comparable with the bias associated with movement planning (Exp. 4, Exp. 5B). These results indicate that visual distortions accompanying manipulations of variables related to action could be mediated by attentional mechanisms.

Eye movements across advertisements express a temporal pattern of bursts of relatively short and relatively long saccades, and this pattern is systematically influenced by activated scene perception goals. This was revealed by a continuous-time hidden Markov model applied to eye movements of 220 participants exposed to 17 ads under a free-viewing condition, and a scene-learning goal (ad memorization), a scene-evaluation goal (ad appreciation), a target-learning goal (product learning), or a target-evaluation goal (product evaluation). The model reflects how attention switches between two states, local and global, expressed in saccades of shorter and longer amplitude on a spatial grid with 48 cells overlaid on the ads. During the 5- to 6-s duration of self-controlled exposure to ads in the magazine context, attention predominantly started in the local state and ended in the global state, and rapidly switched about 5 times between states. The duration of the local attention state was much longer than the duration of the global state. Goals affected the frequency of switching between attention states and the duration of the local, but not of the global, state.

The present study focuses on two aspects of the time course of visual information processing during the perception of natural scenes. The first aspect is the change of fixation duration and saccade amplitude during the first couple of seconds of the inspection period, as has been described by Buswell (), among others. This common effect suggests that the saccade amplitude and fixation duration are in some way controlled by the same mechanism. A simple exponential model containing two parameters can describe the phenomena quite satisfactorily. The parameters of the model show that saccade amplitude and fixation duration may be controlled by a common mechanism. The second aspect under scrutiny is the apparent lack of correlation between saccade amplitude and fixation duration (Viviani, ). The present study shows that a strong but nonlinear relationship between saccade amplitude and fixation duration does exist in picture viewing. A model, based on notions laid out by Findlay and Walker's () model of saccade generation and on the idea of two modes of visual processing (Trevarthen, ), was developed to explain this relationship. The model both fits the data quite accurately and can explain a number of related phenomena.

Eye movements during mental imagery are not epiphenomenal but assist the process of image generation. Commands to the eyes for each fixation are stored along with the visual representation and are used as spatial index in a motor‐based coordinate system for the proper arrangement of parts of an image. In two experiments, subjects viewed an irregular checkerboard or color pictures of fish and were subsequently asked to form mental images of these stimuli while keeping their eyes open. During the perceptual phase, a group of subjects was requested to maintain fixation onto the screen's center, whereas another group was free to inspect the stimuli. During the imagery phase, all of these subjects were free to move their eyes. A third group of subjects (in Experiment 2) was free to explore the pattern but was requested to maintain central fixation during imagery. For subjects free to explore the pattern, the percentage of time spent fixating a specific location during perception was highly correlated with the time spent on the same (empty) locations during imagery. The order of scanning of these locations during imagery was correlated to the original order during perception. The strength of relatedness of these scanpaths and the vividness of each image predicted performance accuracy. Subjects who fixed their gaze centrally during perception did the same spontaneously during imagery. Subjects free to explore during perception, but maintaining central fixation during imagery, showed decreased ability to recall the pattern. We conclude that the eye scanpaths during visual imagery reenact those of perception of the same visual scene and that they play a functional role.

The current experiments examined the hypothesis that scene structure affects time perception. In three experiments, participants judged the duration of realistic scenes that were presented in a normal or jumbled (i.e., incoherent) format. Experiment 1 demonstrated that the subjective duration of normal scenes was greater than subjective duration of jumbled scenes. In Experiment 2, gridlines were added to both normal and jumbled scenes to control for the number of line terminators, and scene structure had no effect. In Experiment 3, participants performed a secondary task that required paying attention to scene structure, and scene structure's effect on duration judgements reemerged. These findings are consistent with the idea that perceived duration can depend on visual–cognitive processing, which in turn depends on both the nature of the stimulus and the goals of the observer.

Albert MK 《Perception》1999,28(11):1347-1360
The visual perception of monocular stimuli perceived as 3-D objects has received considerable attention from researchers in human and machine vision. However, most previous research has focused on how individual 3-D objects are perceived. Here this is extended to a study of how the structure of 3-D scenes containing multiple, possibly disconnected objects and features is perceived. Da Vinci stereopsis, stereo capture, and other surface formation and interpolation phenomena in stereopsis and structure-from-motion suggest that small features having ambiguous depth may be assigned depth by interpolation with features having unambiguous depth. I investigated whether vision may use similar mechanisms to assign relative depth to multiple objects and features in sparse monocular images, such as line drawings, especially when other depth cues are absent. I propose that vision tends to organize disconnected objects and features into common surfaces to construct 3-D-scene interpretations. Interpolations that are too weak to generate a visible surface percept may still be strong enough to assign relative depth to objects within a scene. When there exists more than one possible surface interpolation in a scene, the visual system's preference for one interpolation over another seems to be influenced by a number of factors, including: (i) proximity, (ii) smoothness, (iii) a preference for roughly frontoparallel surfaces and 'ground' surfaces, (iv) attention and fixation, and (v) higher-level factors. I present a variety of demonstrations and an experiment to support this surface-formation hypothesis.

In the current study, we explored observers' use of two distinct analyses for determining their direction of motion, or heading: a scene-based analysis and a motion-based analysis. In two experiments, subjects viewed sequentially presented, paired digitized images of real-world scenes and judged the direction of heading; the pairs were presented with various interstimulus intervals (ISIs). In Experiment 1, subjects could determine heading when the two frames were separated with a 1,000-ms ISI, long enough to eliminate apparent motion. In Experiment 2, subjects performed two tasks, a path-of-motion task and a memory-load task, under three different ISIs, 50 ms, 500 ms, and 1,000 ms. Heading accuracy decreased with an increase in ISI. Increasing memory load influenced heading judgments only for the longer ISI when motion-based information was not available. These results are consistent with the hypothesis that the scene-based analysis has a coarse spatial representation, is a sustained temporal process, and is capacity limited, whereas the motion-based analysis has a fine spatial resolution, is a transient temporal process, and is capacity unlimited.

Humans are sensitive to complexity and regularity in patterns (Falk & Konold, 1997; Yamada, Kawabe, & Miyazaki, 2013). The subjective perception of pattern complexity is correlated with algorithmic (or Kolmogorov-Chaitin) complexity as defined in computer science (Li & Vitányi, 2008), but also with the frequency of naturally occurring patterns (Hsu, Griffiths, & Schreiber, 2010). However, the possible mediational role of natural frequencies in the perception of algorithmic complexity remains unclear. Here we reanalyze Hsu et al. (2010) through a mediational analysis, and complement their results in a new experiment. We conclude that human perception of complexity seems partly shaped by natural scene statistics, thereby establishing a link between the perception of complexity and the effect of natural scene statistics.
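
Editorial illustration (not part of the abstract above): algorithmic complexity itself is not computable, so a common rough stand-in is the length of a pattern after lossless compression. The sketch below assumes zlib-compressed size as that proxy; the function names and the three example patterns are hypothetical choices made here for illustration and are not taken from Hsu et al. (2010) or from the reanalysis described above.

```python
import random
import zlib


def compressed_length(pattern: str) -> int:
    """Size in bytes of the zlib-compressed pattern.

    This is only a crude upper-bound proxy for algorithmic complexity,
    chosen for illustration; it is not the measure used in the study.
    """
    return len(zlib.compress(pattern.encode("ascii"), 9))


def make_patterns(n: int = 1024, seed: int = 0) -> dict:
    """Build three binary strings of length n (n assumed even)."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    return {
        "constant": "0" * n,
        "alternating": "01" * (n // 2),
        "random": "".join(rng.choice("01") for _ in range(n)),
    }


if __name__ == "__main__":
    # Print the proxy value for each pattern.
    for name, pattern in make_patterns().items():
        print(name, compressed_length(pattern))
```

Under this proxy the regular strings compress to far fewer bytes than the pseudo-random one, which matches the intuition that regular patterns have short descriptions. Compressed size depends on the compressor and on string length, so it should be read as an ordinal indication only.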

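Kang Tinghu (康廷虎), Xue Xi (薛西) 《Advances in Psychological Science》 (心理科学进展) 2018,26(9):1617-1623
Scenes are the real environments in which we live, and social scenes are an important component of them. In research on social scene perception, the recognition of action intention is influenced by the background information of the scene and is also related to the object of the action. Researchers can therefore explore the mechanisms that influence action recognition on the basis of background-stimulus and stimulus-stimulus relations; alternatively, they can detect and recognize action intention on the basis of the semantic constraints and physical limits of the scene, following the principle of rational action and its accompanying physiological indices. In machine vision research, computational recognition models offer a new perspective on detecting and recognizing action intention in social scenes. Future research needs to consider the development of the ability to recognize action intention in real scenes, as well as individual and cultural differences in action-intention recognition.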

