Similar Documents
Retrieved 20 similar documents (search time: 15 ms)
1.
Two experiments explored the mapping between language and mental representations of visual scenes. In both experiments, participants viewed, for example, a scene depicting a woman, a wine glass and bottle on the floor, an empty table, and various other objects. In Experiment 1, participants concurrently heard either ‘The woman will put the glass on the table’ or ‘The woman is too lazy to put the glass on the table’. Subsequently, with the scene unchanged, participants heard that the woman ‘will pick up the bottle, and pour the wine carefully into the glass.’ Experiment 2 was identical except that the scene was removed before the onset of the spoken language. In both cases, eye movements after ‘pour’ (anticipating the glass) and at ‘glass’ reflected the language-determined position of the glass, as either on the floor, or moved onto the table, even though the concurrent (Experiment 1) or prior (Experiment 2) scene showed the glass in its unmoved position on the floor. Language-mediated eye movements thus reflect the real-time mapping of language onto dynamically updateable event-based representations of concurrently or previously seen objects (and their locations).

2.
Two experiments investigated sensory/motor-based functional knowledge of man-made objects: manipulation features associated with the actual usage of objects. In Experiment 1, a series of prime-target pairs was presented auditorily, and participants were asked to make a lexical decision on the target word. Participants made a significantly faster decision about the target word (e.g. ‘typewriter’) following a related prime that shared manipulation features with the target (e.g. ‘piano’) than an unrelated prime (e.g. ‘blanket’). In Experiment 2, participants' eye movements were monitored when they viewed a visual display on a computer screen while listening to a concurrent auditory input. Participants were instructed to simply identify the auditory input and touch the corresponding object on the computer display. Participants fixated an object picture (e.g., “typewriter”) related to a target word (e.g., ‘piano’) significantly more often than either an unrelated object picture (e.g., “bucket”) or a visually matched control (e.g., “couch”). Results of the two experiments suggest that manipulation knowledge of words is retrieved without conscious effort and that manipulation knowledge constitutes a part of the lexical-semantic representation of objects.

3.
In the visual world paradigm, participants are more likely to fixate a visual referent that has some semantic relationship with a heard word, than they are to fixate an unrelated referent [Cooper, R. M. (1974). The control of eye fixation by the meaning of spoken language. A new methodology for the real-time investigation of speech perception, memory, and language processing. Cognitive Psychology, 6, 813-839]. Here, this method is used to examine the psychological validity of models of high-dimensional semantic space. The data strongly suggest that these corpus-based measures of word semantics predict fixation behavior in the visual world and provide further evidence that language-mediated eye movements to objects in the concurrent visual environment are driven by semantic similarity rather than all-or-none categorical knowledge. The data suggest that the visual world paradigm can, together with other methodologies, converge on the evidence that may help adjudicate between different theoretical accounts of psychological semantics.
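For readers unfamiliar with such corpus-based measures: relatedness in a high-dimensional semantic space is conventionally quantified as the cosine of the angle between word co-occurrence vectors. Below is a minimal Python sketch of that computation; the five-dimensional toy vectors are hypothetical illustrations, not the model evaluated in the study.

import numpy as np

def cosine_similarity(u, v):
    # Cosine of the angle between two word vectors:
    # 1 = identical direction, 0 = orthogonal (unrelated).
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical co-occurrence vectors (illustration only).
piano   = np.array([0.9, 0.1, 0.4, 0.0, 0.2])
trumpet = np.array([0.8, 0.2, 0.5, 0.1, 0.1])
carrot  = np.array([0.1, 0.9, 0.0, 0.7, 0.3])

print(cosine_similarity(piano, trumpet))  # high: semantically close
print(cosine_similarity(piano, carrot))   # low: semantically distant

On this view, the probability of fixating a depicted object should increase gradedly with its cosine similarity to the heard word, rather than jumping categorically for same-category items.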

4.
Recent converging evidence suggests that language and vision interact immediately in non-trivial ways, although the exact nature of this interaction is still unclear. Not only does linguistic information influence visual perception in real-time, but visual information also influences language comprehension in real-time. For example, in visual search tasks, incremental spoken delivery of the target features (e.g., “Is there a red vertical?”) can increase the efficiency of conjunction search because only one feature is heard at a time. Moreover, in spoken word recognition tasks, the visual presence of an object whose name is similar to the word being spoken (e.g., a candle present when instructed to “pick up the candy”) can alter the process of comprehension. Dense sampling methods, such as eye-tracking and reach-tracking, richly illustrate the nature of this interaction, providing a semi-continuous measure of the temporal dynamics of individual behavioral responses. We review a variety of studies that demonstrate how these methods are particularly promising in further elucidating the dynamic competition that takes place between underlying linguistic and visual representations in multimodal contexts, and we conclude with a discussion of the consequences that these findings have for theories of embodied cognition.

5.
Different kinds of speech sounds are used to signify possible word forms in every language. For example, lexical stress is used in Spanish (/ˈbe.be/, ‘he/she drinks’ versus /be.ˈbe/, ‘baby’), but not in French (/ˈbe.be/ and /be.ˈbe/ both mean ‘baby’). Infants learn many such native language phonetic contrasts in their first year of life, likely using a number of cues from parental speech input. One such cue could be parents’ object labeling, which can explicitly highlight relevant contrasts. Here we ask whether phonetic learning from object labeling is abstract—that is, if learning can generalize to new phonetic contexts. We investigate this issue in the prosodic domain, as the abstraction of prosodic cues (like lexical stress) has been shown to be particularly difficult. One group of 10-month-old French-learners was given consistent word labels that contrasted on lexical stress (e.g., Object A was labeled /ˈma.bu/, and Object B was labeled /ma.ˈbu/). Another group of 10-month-olds was given inconsistent word labels (i.e., mixed pairings), and stress discrimination in both groups was measured in a test phase with words made up of new syllables. Infants trained with consistently contrastive labels showed an earlier effect of discrimination compared to infants trained with inconsistent labels. Results indicate that phonetic learning from object labeling can indeed generalize, and suggest one way infants may learn the sound properties of their native language(s).

6.
This study examined whether children use prosodic correlates to word meaning when interpreting novel words. For example, do children infer that a word spoken in a deep, slow, loud voice refers to something larger than a word spoken in a high, fast, quiet voice? Participants were 4- and 5-year-olds who viewed picture pairs that varied along a single dimension (e.g., big vs. small flower) and heard a recorded voice asking them, for example, “Can you get the blicket one?” spoken with either meaningful or neutral prosody. The 4-year-olds failed to map prosodic cues to their corresponding meaning, whereas the 5-year-olds succeeded (Experiment 1). However, 4-year-olds successfully mapped prosodic cues to word meaning following a training phase that reinforced children’s attention to prosodic information (Experiment 2). These studies constitute the first empirical demonstration that young children are able to use prosody-to-meaning correlates as a cue to novel word interpretation.

7.
Thinking about the abstract concept power may automatically activate the spatial up-down image schema (powerful up; powerless down) and consequently direct spatial attention to the image-schema-congruent location. Participants indicated whether a word represented a powerful or powerless person (e.g. ‘king’ or ‘servant’). Following each decision, they identified a target at the top or bottom of the visual field. In Experiment 1 participants identified the target faster when its spatial position was congruent with the perceived power of the preceding word than when it was incongruent. In Experiment 2 ERPs showed a higher N1 amplitude for congruent spatial positions. These results support the view that attention is driven to the image-schema-congruent location of a power word. Thus, power is partially understood in terms of vertical space, which demonstrates that abstract concepts are grounded in sensory-motor processing.

8.
Due to extensive variability in the phonetic realizations of words, there may be few or no proximal spectro-temporal cues that identify a word’s onset or even its presence. Dilley and Pitt (2010) showed that the rate of context speech, distal from a to-be-recognized word, can have a sizeable effect on whether or not a word is perceived. This investigation considered whether there is a distinct role for distal rhythm in the disappearing word effect. Listeners heard sentences that had a grammatical interpretation with or without a critical function word (FW) and transcribed what they heard (e.g., ‘are’ in ‘Jill got quite mad when she heard there are birds’ can be removed, and ‘Jill got quite mad when she heard their birds’ is still grammatical). Consistent with a perceptual grouping hypothesis, participants were more likely to report critical FWs when distal rhythm (repeating ternary or binary pitch patterns) matched the rhythm in the FW-containing region than when it did not. Notably, effects of distal rhythm and distal rate were additive. Results demonstrate a novel effect of distal rhythm on the amount of lexical material listeners hear, highlighting the importance of distal timing information and providing new constraints for models of spoken word recognition.

9.
The delay between the signal to move the eyes and the execution of the corresponding eye movement is variable and skewed, with an early peak followed by a considerable tail. This skewed distribution renders the answer to the question “What is the delay between language input and saccade execution?” problematic; for a given task, there is no single number, only a distribution of numbers. Here, two previously published studies are reanalysed, whose designs enable us to answer, instead, the question: How long does it take, as the language unfolds, for the oculomotor system to demonstrate sensitivity to the distinction between “signal” (eye movements due to the unfolding language) and “noise” (eye movements due to extraneous factors)? In two studies, participants heard either ‘the man…’ or ‘the girl…’, and the distribution of launch times towards the concurrently, or previously, depicted man in response to these two inputs was calculated. In both cases, the earliest discrimination between signal and noise occurred at around 100 ms. This rapid interplay between language and oculomotor control is most likely due to cancellation of about-to-be-executed saccades towards objects (or their episodic trace) that mismatch the earliest phonological moments of the unfolding word.
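The logic of this reanalysis can be illustrated with a small simulation (a Python sketch under assumed parameters, not the authors' analysis code): generate a skewed distribution of language-driven launch times, a flat distribution of extraneous launches, and locate the earliest time bin at which the two diverge.

import numpy as np

rng = np.random.default_rng(0)
# Simulated launch times (ms from word onset): 'signal' saccades are
# skewed with an early peak and long tail; 'noise' saccades are flat.
signal = rng.gamma(shape=3.0, scale=80.0, size=2000) + 100
noise = rng.uniform(0, 1000, size=2000)

bins = np.arange(0, 1001, 20)  # 20-ms bins
sig_hist, _ = np.histogram(signal, bins=bins)
noi_hist, _ = np.histogram(noise, bins=bins)

# Earliest bin where signal launches clearly exceed the noise baseline
# (an arbitrary 1.5x criterion, purely for illustration).
diverge = next(t for t, s, n in zip(bins, sig_hist, noi_hist)
               if s > 1.5 * max(n, 1))
print(f"Earliest signal/noise divergence: ~{diverge} ms after word onset")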

10.
Two visual world experiments investigated the activation of semantically related concepts during the processing of environmental sounds and spoken words. Participants heard environmental sounds such as barking or spoken words such as “puppy” while viewing visual arrays with objects such as a bone (semantically related competitor) and candle (unrelated distractor). In Experiment 1, a puppy (target) was also included in the visual array; in Experiment 2, it was not. During both types of auditory stimuli, competitors were fixated significantly more than distractors, supporting the coactivation of semantically related concepts in both cases; comparisons of the two types of auditory stimuli also revealed significantly larger effects with environmental sounds than spoken words. We discuss implications of these results for theories of semantic knowledge.

11.
This study investigated whether or not the temporal information encoded in aspectual morphemes can be used immediately by young children to facilitate event recognition during online sentence comprehension. We focused on the contrast between two grammatical aspectual morphemes in Mandarin Chinese, the perfective morpheme –le and the (imperfective) durative morpheme –zhe. The perfective morpheme –le is often used to indicate that an event has been completed, whereas the durative morpheme –zhe indicates that an event is still in progress or continuing. We were interested to see whether young children are able to use the temporal reference encoded in the two aspectual morphemes (i.e., completed versus ongoing) as rapidly as adults to facilitate event recognition during online sentence comprehension. Using the visual world eye-tracking paradigm, we tested 34 Mandarin-speaking adults and 99 Mandarin-speaking children (35 three-year-olds, 32 four-year-olds and 32 five-year-olds). On each trial, participants were presented with spoken sentences containing either of the two aspectual morphemes while viewing a visual image containing two pictures, one representing a completed event and one representing an ongoing event. Participants’ eye movements were recorded from the onset of the spoken sentences. The results show that both the adults and the three age groups of children exhibited a facilitatory effect triggered by the aspectual morpheme: hearing the perfective morpheme –le triggered more eye movements to the completed event area, whereas hearing the durative morpheme –zhe triggered more eye movements to the ongoing event area. This effect occurred immediately after the onset of the aspectual morpheme, both for the adults and the three groups of children. This is evidence that young children are able to use the temporal information encoded in aspectual morphemes as rapidly as adults to facilitate event recognition. Children’s eye movement patterns reflect a rapid mapping of grammatical aspect onto the temporal structures of events depicted in the visual scene.

12.
Recent research on grapheme-colour synesthesia has focused on whether visual attention is necessary to induce a synesthetic percept. The current study investigated the influence of synesthesia on overt visual attention during an oculomotor target selection task. Chromatic and achromatic stimuli were presented with one target among distractors (e.g. a ‘2’ (target) among multiple ‘5’s (distractors)). Participants executed an eye movement to the target. Synesthetes and controls showed a comparable target selection performance across conditions and a ‘pop-out effect’ was only seen in the chromatic condition. As a pop-out effect was absent for the synesthetes in the achromatic condition, a synesthetic element appears not to elicit a synesthetic colour, even when it is the target. The synesthetic percepts are not pre-attentively available to distinguish the synesthetic target from synesthetic distractors when elements are presented in the periphery. Synesthesia appears to require full recognition to bind form and colour.

13.
English and German children aged 2 years 4 months and 4 years heard both novel and familiar verbs in sentences whose form was grammatical, but which mismatched the event they were watching (e.g., ‘The frog is pushing the lion’, when the lion was actually the ‘agent’ or ‘doer’ of the pushing). These verbs were then elicited in new sentences. All children mostly corrected the familiar verb (i.e., they used the agent as the grammatical subject), but there were cross-linguistic differences among the 2-year-olds concerning the novel verb. When English 2-year-olds used the novel verb they mostly corrected. However, their most frequent response was to avoid using the novel verb altogether. German 2-year-olds corrected the novel verb significantly more often than their English counterparts, demonstrating more robust verb-general representations of agent- and patient-marking. These findings provide support for a ‘graded representations’ view of development, which proposes that grammatical representations may be simultaneously abstract but ‘weak’.

14.
The sounds that make up spoken words are heard in a series and must be mapped rapidly onto words in memory because their elements, unlike those of visual words, cannot simultaneously exist or persist in time. Although theories agree that the dynamics of spoken word recognition are important, they differ in how they treat the nature of the competitor set: precisely which words are activated as an auditory word form unfolds in real time. This study used eye tracking to measure the impact over time of word frequency and 2 partially overlapping competitor set definitions: onset density and neighborhood density. Time course measures revealed early and continuous effects of frequency (facilitatory) and onset-based similarity (inhibitory). Neighborhood density appears to have early facilitatory effects and late inhibitory effects. The late inhibitory effects are due to differences in the temporal distribution of similarity within neighborhoods. The early facilitatory effects are due to subphonemic cues that inform the listener about word length before the entire word is heard. The results support a new conception of lexical competition neighborhoods in which recognition occurs against a background of activated competitors that changes over time based on fine-grained goodness-of-fit and competition dynamics.
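The two competitor-set definitions contrasted here can be made concrete with a short sketch (conventional formulations in Python, not the study's implementation; the miniature lexicon and rough phonemic transcriptions are hypothetical):

def one_phoneme_apart(a, b):
    # True if the phoneme sequences differ by exactly one substitution,
    # insertion, or deletion (Levenshtein distance of 1).
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    short, long_ = (a, b) if len(a) < len(b) else (b, a)
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))

def neighborhood_density(target, lexicon):
    # Neighborhood density: number of words one phoneme away from the target.
    return sum(one_phoneme_apart(target, w) for w in lexicon if w != target)

def onset_density(target, lexicon, k=2):
    # Onset (cohort) density: words sharing the target's first k phonemes.
    return sum(w[:k] == target[:k] for w in lexicon if w != target)

lexicon = [("k", "ae", "t"), ("b", "ae", "t"), ("k", "ae", "p"),
           ("k", "ow", "t"), ("ae", "t"), ("d", "ao", "g")]
target = ("k", "ae", "t")
print(neighborhood_density(target, lexicon))  # 4 neighbours
print(onset_density(target, lexicon))         # 1 onset competitor

Because onset competitors overlap with the target at the earliest moments of the unfolding word, while many neighbours overlap only later, the two sets predict different time courses of competition, which is the dissociation the eye-tracking measures reveal.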

15.
Across languages, children map words to meaning with great efficiency, despite a seemingly unconstrained space of potential mappings. The literature on how children do this is primarily limited to spoken language. This leaves a gap in our understanding of sign language acquisition, because several of the hypothesized mechanisms that children use are visual (e.g., visual attention to the referent), and sign languages are perceived in the visual modality. Here, we used the Human Simulation Paradigm in American Sign Language (ASL) to determine potential cues to word learning. Sign-naïve adult participants viewed video clips of parent–child interactions in ASL, and at a designated point, had to guess what ASL sign the parent produced. Across two studies, we demonstrate that referential clarity in ASL interactions is characterized by access to information about word class and referent presence (for verbs), similarly to spoken language. Unlike spoken language, iconicity is a cue to word meaning in ASL, although this is not always a fruitful cue. We also present evidence that verbs are highlighted well in the input, relative to spoken English. The results shed light on both similarities and differences in the information that learners may have access to in acquiring signed versus spoken languages.

16.
Linguistically mediated visual search
During an individual's normal interaction with the environment and other humans, visual and linguistic signals often coincide and can be integrated very quickly. This has been clearly demonstrated in recent eyetracking studies showing that visual perception constrains on-line comprehension of spoken language. In a modified visual search task, we found the inverse, that real-time language comprehension can also constrain visual perception. In standard visual search tasks, the number of distractors in the display strongly affects search time for a target defined by a conjunction of features, but not for a target defined by a single feature. However, we found that when a conjunction target was identified by a spoken instruction presented concurrently with the visual display, the incremental processing of spoken language allowed the search process to proceed in a manner considerably less affected by the number of distractors. These results suggest that perceptual systems specialized for language and for vision interact more fluidly than previously thought.

17.
We investigated the spontaneous activation of phonologically related words in high- and low-proficient Hindi–English bilinguals during spoken word processing in an eye-tracking study. Participants listened to spoken words in L1/L2 and looked at a display (consisting of line drawings of the phonological cohort of the translation equivalent of the spoken word and unrelated distractors). Both groups were quick in orienting their attention towards the competitor with the onset of the spoken word. Furthermore, high-proficient bilinguals showed higher and earlier activation of the competitor compared to low-proficient bilinguals. Cross-language activations were higher in the L2–L1 direction for both groups. The results strongly suggest language non-selective access of translation in Hindi–English bilinguals in both language directions. We discuss the results with regard to the predictions of bilingual language processing models and the effect of language proficiency on conceptual access during listening in bilinguals.

18.
Participants saw a small number of objects in a visual display and performed a visual detection or visual-discrimination task in the context of task-irrelevant spoken distractors. In each experiment, a visual cue was presented 400 ms after the onset of a spoken word. In Experiments 1 and 2, the cue was an isoluminant color change and participants generated an eye movement to the target object. In Experiment 1, responses were slower when the spoken word referred to the distractor object than when it referred to the target object. In Experiment 2, responses were slower when the spoken word referred to a distractor object than when it referred to an object not in the display. In Experiment 3, the cue was a small shift in location of the target object and participants indicated the direction of the shift. Responses were slowest when the word referred to the distractor object, faster when the word did not have a referent, and fastest when the word referred to the target object. Taken together, the results demonstrate that referents of spoken words capture attention.

19.
The approach/avoidance effect refers to the finding that valenced stimuli trigger approach and avoidance actions. Markman and Brendl [Markman, A. B., & Brendl, M. (2005). Constraining theories of embodied cognition. Psychological Science, 16, 6-16] argued that this effect is not a truly embodied phenomenon, but depends on participants’ symbolic representation of the self. In their study, participants moved valenced words toward or away from their own name on the computer screen. This would induce participants to form a ‘disembodied’ self-representation at the location of their name, outside of the body. Approach/avoidance effects occurred with respect to the participant’s name, rather than with respect to the body. In three experiments, we demonstrate that similar effects are found when the name is replaced by a positive word, a negative word, or even when no word is presented at all. This suggests that the ‘disembodied self’ explanation of Markman and Brendl is incorrect, and that their findings do not necessarily constrain embodied theories of cognition.

20.
Listeners infer which object in a visual scene a speaker refers to from the systematic variation of the speaker's tone of voice (ToV). We examined whether ToV also guides word learning. During exposure, participants heard novel adjectives (e.g., “daxen”) spoken with a ToV representing hot, cold, strong, weak, big, or small while viewing picture pairs representing the meaning of the adjective and its antonym (e.g., elephant–ant for big–small). Eye fixations were recorded to monitor referent detection and learning. During test, participants heard the adjectives spoken with a neutral ToV, while selecting referents from familiar and unfamiliar picture pairs. Participants were able to learn the adjectives' meanings, and, even in the absence of informative ToV, generalize them to new referents. A second experiment addressed whether ToV provides sufficient information to infer the adjectival meaning or needs to operate within a referential context providing information about the relevant semantic dimension. Participants who saw printed versions of the novel words during exposure performed at chance during test. ToV, in conjunction with the referential context, thus serves as a cue to word meaning. ToV establishes relations between labels and referents for listeners to exploit in word learning.
