20 similar documents found
1.
D. W. Massaro & M. M. Cohen, Journal of Experimental Psychology: Human Perception and Performance, 1983, 9(5): 753-771
Three experiments were carried out to investigate the evaluation and integration of visual and auditory information in speech perception. In the first two experiments, subjects identified /ba/ or /da/ speech events consisting of high-quality synthetic syllables ranging from /ba/ to /da/ combined with a videotaped /ba/ or /da/ or neutral articulation. Although subjects were specifically instructed to report what they heard, visual articulation made a large contribution to identification. The tests of quantitative models provide evidence for the integration of continuous and independent, as opposed to discrete or nonindependent, sources of information. The reaction times for identification were primarily correlated with the perceived ambiguity of the speech event. In a third experiment, the speech events were identified with an unconstrained set of response alternatives. In addition to /ba/ and /da/ responses, the /bda/ and /tha/ responses were well described by a combination of continuous and independent features. This body of results provides strong evidence for a fuzzy logical model of perceptual recognition.
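The quantitative account named in this abstract, Massaro's fuzzy logical model of perception (FLMP), combines independent auditory and visual degrees of support multiplicatively and normalizes over the response alternatives. Below is a minimal sketch of the standard two-alternative /ba/-/da/ case in Python; the support values are illustrative assumptions, not parameters fitted in the study.

# Sketch of the two-alternative fuzzy logical model of perception (FLMP):
# auditory and visual support for /da/ (each a fuzzy truth value in 0..1)
# are multiplied and normalized against the combined support for /ba/.
def flmp_p_da(a_da, v_da):
    support_da = a_da * v_da
    support_ba = (1.0 - a_da) * (1.0 - v_da)
    return support_da / (support_da + support_ba)

# Illustrative values only: an ambiguous auditory token (0.5) paired with
# a clear visual /da/ articulation (0.9) yields a strong /da/ percept.
print(flmp_p_da(0.5, 0.9))  # ~0.9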
2.
3.
Conflicting visual speech information can influence the perception of acoustic speech, causing an illusory percept of a sound not present in the actual acoustic speech (the McGurk effect). We examined whether participants can voluntarily selectively attend to either the auditory or visual modality by instructing participants to pay attention to the information in one modality and to ignore competing information from the other modality. We also examined how performance under these instructions was affected by weakening the influence of the visual information by manipulating the temporal offset between the audio and video channels (experiment 1), and the spatial frequency information present in the video (experiment 2). Gaze behaviour was also monitored to examine whether attentional instructions influenced the gathering of visual information. While task instructions did have an influence on the observed integration of auditory and visual speech information, participants were unable to completely ignore conflicting information, particularly information from the visual stream. Manipulating temporal offset had a more pronounced interaction with task instructions than manipulating the amount of visual information. Participants' gaze behaviour suggests that the attended modality influences the gathering of visual information in audiovisual speech perception.
4.
Spontaneous beat gestures are an integral part of the paralinguistic context during face-to-face conversations. Here we investigated the time course of beat-speech integration in speech perception by measuring ERPs evoked by words pronounced with or without an accompanying beat gesture, while participants watched a spoken discourse. Words accompanied by beats elicited a positive shift in ERPs at an early sensory stage (before 100 ms) and at a later time window coinciding with the auditory component P2. The same word tokens produced no ERP differences when participants listened to the discourse without view of the speaker. We conclude that beat gestures are integrated with speech early on in time and modulate sensory/phonological levels of processing. The present results support the possible role of beats as a highlighter, helping the listener to direct the focus of attention to important information and modulate the parsing of the speech stream.
5.
6.
7.
Dennis L. Molfese, Brain and Language, 1980, 11(2): 285-299
Developmental research reporting electrophysiological correlates of voice onset time (VOT) during speech perception is reviewed. By two months of age a right hemisphere mechanism appears which differentiates voiced from voiceless stop consonants. This mechanism was found at 4 years of age and again with adults. A new study is described which represents an attempt to determine a more specific basis for VOT perception. Auditory evoked responses (AER) were recorded over the left and right hemispheres while 16 adults attended to repetitive series of two-tone stimuli. Portions of the AERs were found to vary systematically over the two hemispheres in a manner similar to that previously reported for VOT stimuli. These findings are discussed in terms of a temporal detection mechanism which is involved in speech perception.
8.
Rachel M. Miller, Kauyumari Sanchez, & Lawrence D. Rosenblum, Attention, Perception & Psychophysics, 2010, 72(6): 1614-1625
Speech alignment is the tendency for interlocutors to unconsciously imitate one another’s speaking style. Alignment also occurs when a talker is asked to shadow recorded words (e.g., Shockley, Sabadini, & Fowler, 2004). In two experiments, we examined whether alignment could be induced with visual (lipread) speech and with auditory speech. In Experiment 1, we asked subjects to lipread and shadow out loud a model silently uttering words. The results indicate that shadowed utterances sounded more similar to the model’s utterances than did subjects’ nonshadowed read utterances. This suggests that speech alignment can be based on visual speech. In Experiment 2, we tested whether raters could perceive alignment across modalities. Raters were asked to judge the relative similarity between a model’s visual (silent video) utterance and subjects’ audio utterances. The subjects’ shadowed utterances were again judged as more similar to the model’s than were read utterances, suggesting that raters are sensitive to cross-modal similarity between aligned words.
9.
Current theories of consciousness assume a qualitative dissociation between conscious and unconscious processing: while subliminal stimuli only elicit a transient activity, supraliminal stimuli have long-lasting influences. Nevertheless, the existence of this qualitative distinction remains controversial, as past studies confounded awareness and stimulus strength (energy, duration). Here, we used a masked speech priming method in conjunction with a submillisecond interaural delay manipulation to contrast subliminal and supraliminal processing at constant prime, mask and target strength. This delay induced a perceptual streaming effect, with the prime popping out in the supraliminal condition. By manipulating the prime-target interval (ISI), we show a qualitatively distinct profile of priming longevity as a function of prime awareness. While subliminal priming disappeared after half a second, supraliminal priming was independent of ISI. This shows that the distinction between conscious and unconscious processing depends on high-level perceptual streaming factors rather than low-level features (energy, duration).
10.
The cyclic variation in the energy envelope of the speech signal results from the production of speech in syllables. This acoustic property is often identified as a source of information in the perception of syllable attributes, though spectral variation can also provide this information reliably. In the present study of the relative contributions of the energy and spectral envelopes in speech perception, we employed sinusoidal replicas of utterances, which permitted us to examine the roles of these acoustic properties in establishing or maintaining time-varying perceptual coherence. Three experiments were carried out to assess the independent perceptual effects of variation in sinusoidal amplitude and frequency, using sentence-length signals. In Experiment 1, we found that the fine grain of amplitude variation was not necessary for the perception of segmental and suprasegmental linguistic attributes; in Experiment 2, we found that amplitude was nonetheless effective in influencing syllable perception, and that in some circumstances it was crucial to segmental perception; in Experiment 3, we observed that coarse-grain amplitude variation, above all, proved to be extremely important in phonetic perception. We conclude that in perceiving sinusoidal replicas, the perceiver derives much from following the coherent pattern of frequency variation and gross signal energy, but probably derives rather little from tracking the precise details of the energy envelope. These findings encourage the view that the perceiver uses time-varying acoustic properties selectively in understanding speech.
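Sinusoidal replicas of the kind described in this abstract are built by replacing each formant with a single tone whose frequency and amplitude follow that formant's track over time. The sketch below shows the basic synthesis step, assuming per-frame formant frequency and amplitude tracks have already been estimated; the function name, frame rate, and sample rate are illustrative assumptions, not details taken from the study.

import numpy as np

def sinewave_replica(freq_tracks, amp_tracks, frame_rate=100, sr=16000):
    # freq_tracks, amp_tracks: arrays of shape (n_formants, n_frames),
    # giving each formant's center frequency (Hz) and amplitude per frame.
    n_formants, n_frames = freq_tracks.shape
    n_samples = int(n_frames * sr / frame_rate)
    t_frames = np.arange(n_frames) / frame_rate
    t_samples = np.arange(n_samples) / sr
    signal = np.zeros(n_samples)
    for k in range(n_formants):
        # Interpolate the frame-rate tracks up to the audio sample rate.
        freq = np.interp(t_samples, t_frames, freq_tracks[k])
        amp = np.interp(t_samples, t_frames, amp_tracks[k])
        # Accumulate phase so the tone frequency can vary over time.
        phase = 2 * np.pi * np.cumsum(freq) / sr
        signal += amp * np.sin(phase)
    return signal

# To separate the roles of the energy and spectral envelopes, as in the
# experiments described, the amplitude tracks can be flattened or coarsened
# (e.g., replaced by their mean) while the frequency tracks are left intact.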
11.
Twenty-seven patients with right-hemisphere damage (RBD) and thirty-one patients with left-hemisphere damage (LBD) received a new pragmatics battery in Hebrew consisting of two parts: (1) comprehension and production of basic speech acts (BSAs), including tests of assertions, questions, requests, and commands, and (2) comprehension of implicatures, including implicatures of quantity, quality, relevance, and manner. Each test had a verbal and a nonverbal version. Patients also received Hebrew versions of the Western Aphasia Battery and of the Right Hemisphere Communication Battery. Both LBD and RBD patients were impaired relative to controls but did not differ from each other in their overall scores on BSAs and on Implicatures when scores were corrected by aphasia and neglect indices. There was a systematic localization of BSAs in the left hemisphere (LH) but not in the right hemisphere (RH). There was poor localization of Implicatures in either hemisphere. In LBD patients, BSAs were associated with language functions measured with the WAB, suggesting the radical possibility that the classic localization of language functions in aphasia is influenced by the localization of the BSAs required by aphasia language tests. Both BSAs and implicatures show greater functional independence from other pragmatic, language and cognitive functions in the RBD than in the LBD patients. Thus, the LH is more likely to contain an unmodular domain-nonspecific set of central cognitive mechanisms for applying means-ends rationality principles to intentional activity.
12.
Twenty right-handed kindergarten children with superior language skills and twenty with deficient language skills (as defined by performance on an elicited sentence repetition task) were tested (1) for hemispheric specialization for speech perception with a dichotic CV syllable task and (2) for relative manual proficiency by means of a battery of hand tasks. Reading readiness and aspects of other cognitive abilities were also assessed. The superior children evidenced a mean right-ear advantage of 14.5%, which is consistent with normal values reported by other investigators using the same stimuli. The language deficient group evidenced essentially no mean ear advantage (0.5) with half of these subjects exhibiting left-ear superiority. The findings suggest relationships among cerebral dominance, language proficiency (including reading readiness), and general cognitive functioning.
13.
The role of visual information in the processing of place and manner features in speech perception
Visual information provided by a talker's mouth movements can influence the perception of certain speech features. Thus, the "McGurk effect" shows that when the syllable /bi/ is presented audibly in synchrony with a visually presented /gi/, a person perceives the talker as saying /di/. Moreover, studies have shown that interactions occur between place and voicing features in phonetic perception, when information is presented audibly. In our first experiment, we asked whether feature interactions occur when place information is specified by a combination of auditory and visual information. Members of an auditory continuum ranging from /ibi/ to /ipi/ were paired with a video display of a talker saying /igi/. The auditory tokens were heard as ranging from /ibi/ to /ipi/, but the auditory-visual tokens were perceived as ranging from /idi/ to /iti/. The results demonstrated that the voicing boundary for the auditory-visual tokens was located at a significantly longer VOT value than the voicing boundary for the auditory continuum presented without the visual information. These results demonstrate that place-voice interactions are not limited to situations in which place information is specified audibly. (ABSTRACT TRUNCATED AT 250 WORDS)
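The voicing boundary discussed in this abstract is conventionally estimated by fitting a psychometric function to the proportion of voiceless responses at each VOT step and reading off the 50% crossover point. The abstract does not specify the fitting procedure, so the sketch below shows only one common approach (a two-parameter logistic); the identification proportions are placeholders, not the study's data.

import numpy as np
from scipy.optimize import curve_fit

def logistic(vot, boundary, slope):
    # Proportion of voiceless (/ipi/-type) responses as a function of VOT (ms).
    return 1.0 / (1.0 + np.exp(-slope * (vot - boundary)))

# Placeholder identification data: VOT steps (ms) and proportion of
# voiceless responses for one presentation condition.
vot_ms = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
p_voiceless = np.array([0.02, 0.05, 0.15, 0.55, 0.88, 0.97, 0.99])

(boundary, slope), _ = curve_fit(logistic, vot_ms, p_voiceless, p0=[30.0, 0.2])
print(f"estimated voicing boundary: {boundary:.1f} ms VOT")

Comparing the boundary fitted for audio-alone tokens with that fitted for audio-visual tokens is one way to quantify the boundary shift the abstract reports.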
14.
Integral processing of visual place and auditory voicing information during phonetic perception.
K. P. Green & P. K. Kuhl, Journal of Experimental Psychology: Human Perception and Performance, 1991, 17(1): 278-288
Results of auditory speech experiments show that reaction times (RTs) for place classification in a test condition in which stimuli vary along the dimensions of both place and voicing are longer than RTs in a control condition in which stimuli vary only in place. Similar results are obtained when subjects are asked to classify the stimuli along the voicing dimension. By taking advantage of the "McGurk" effect (McGurk & MacDonald, 1976), the present study investigated whether a similar pattern of interference extends to situations in which variation along the place dimension occurs in the visual modality. The results showed that RTs for classifying phonetic features in the test condition were significantly longer than in the control condition for the place and voicing dimensions. These results indicate a mutual and symmetric interference exists in the classification of the two dimensions, even when the variation along the dimensions occurs in separate modalities.
15.
Because visual perception has temporal extent, temporally discontinuous input must be linked in memory. Recent research has suggested that this may be accomplished by integrating the active contents of visual short-term memory (VSTM) with subsequently perceived information. In the present experiments, we explored the relationship between VSTM consolidation and maintenance and eye movements, in order to discover how attention selects the information that is to be integrated. Specifically, we addressed whether stimuli needed to be overtly attended in order to be included in the memory representation or whether covert attention was sufficient. Results demonstrated that in static displays in which the to-be-integrated information was presented in the same spatial location, VSTM consolidation proceeded independently of the eyes, since subjects made few eye movements. In dynamic displays, however, in which the to-be-integrated information was presented in different spatial locations, eye movements were directly related to task performance. We conclude that these differences are related to different encoding strategies. In the static display case, VSTM was maintained in the same spatial location as that in which it was generated. This could apparently be accomplished with covert deployments of attention. In the dynamic case, however, VSTM was generated in a location that did not overlap with one of the to-be-integrated percepts. In order to "move" the memory trace, overt shifts of attention were required.
16.
This study explored asymmetries for movement, expression and perception of visual speech. Sixteen dextral models were videoed as they articulated: 'bat,' 'cat,' 'fat,' and 'sat.' Measurements revealed that the right side of the mouth was opened wider and for a longer period than the left. The asymmetry was accentuated at the beginning and ends of the vocalization and was attenuated for words where the lips did not articulate the first consonant. To measure asymmetries in expressivity, 20 dextral observers watched silent videos and reported what was said. The model's mouth was covered so that the left, right or both sides were visible. Fewer errors were made when the right mouth was visible compared to the left--suggesting that the right side is more visually expressive of speech. Investigation of asymmetries in perception using mirror-reversed clips revealed that participants did not preferentially attend to one side of the speaker's face. A correlational analysis revealed an association between movement and expressivity whereby a more motile right mouth led to stronger visual expressivity of the right mouth. The asymmetries are most likely driven by left hemisphere specialization for language, which causes a rightward motoric bias.
17.
To what extent is simultaneous visual and auditory perception subject to capacity limitations and attentional control? Two experiments addressed this question by asking observers to recognize test tones and test letters under selective and divided attention. In Experiment 1, both stimuli occurred on each trial, but subjects were cued in advance to process just one or both of the stimuli. In Experiment 2, subjects processed one stimulus and then the other or processed both stimuli simultaneously. Processing time was controlled using a backward recognition masking task. A significant, but small, attention effect was found in both experiments. The present positive results weaken the interpretation that previous attentional effects were due to the particular duration judgment task that was employed. The answer to the question addressed by the experiments appears to be that the degree of capacity limitations and attentional control during visual and auditory perception is small but significant.
18.
A previous study (Ackermann, Gröber, Hertrich, & Daum, 1997) reported impaired phoneme identification in cerebellar disorders, provided that categorization depended on temporal cues. In order to further clarify the underlying mechanism of the observed deficit, the present study performed a discrimination and identification task in cerebellar patients using two-tone sequences of variable pause length. Cerebellar dysfunctions were found to compromise the discrimination of time intervals extending in duration from 10 to 150 ms, a range covering the length of acoustic speech segments. In contrast, categorization of the same stimuli as a "short" or "long pause" turned out to be unimpaired. These findings, along with the data of the previous investigation, indicate, first, that the cerebellum participates in the perceptual processing of speech and nonspeech stimuli and, second, that this organ might act as a back-up mechanism, extending the storage capacities of the "auditory analyzer" extracting temporal cues from acoustic signals.
19.
William Prinzmetal, Attention, Perception & Psychophysics, 1981, 30(4): 330-340
Several recent theories of visual information processing have postulated that errors in recognition may result not only from a failure in feature extraction, but also from a failure to correctly join features after they have been correctly extracted. Errors that result from incorrectly integrating features are called conjunction errors. The present study uses conjunction errors to investigate the principles used by the visual system to integrate features. The research tests whether the visual system is more likely to integrate features located close together in visual space (the location principle) or whether the visual system is more likely to integrate features from stimulus items that come from the same perceptual group or object (the perceptual group principle). In four target-detection experiments, stimuli were created so that feature integration by the location principle and feature integration by the perceptual group principle made different predictions for performance. In all of the experiments, the perceptual group principle predicted feature integration even though the distance between stimulus items and retinal eccentricity were strictly controlled.
20.
C. A. Fowler & D. J. Dekle, Journal of Experimental Psychology: Human Perception and Performance, 1991, 17(3): 816-828
Three experiments investigated the "McGurk effect" whereby optically specified syllables experienced synchronously with acoustically specified syllables integrate in perception to determine a listener's auditory perceptual experience. Experiments contrasted the cross-modal effect of orthographic on acoustic syllables presumed to be associated in experience and memory with that of haptically experienced and acoustic syllables presumed not to be associated. The latter pairing gave rise to cross-modal influences when Ss were informed that cross-modal syllables were paired independently. Mouthed syllables affected reports of simultaneously heard syllables (and vice versa). These effects were absent when syllables were simultaneously seen (spelled) and heard. The McGurk effect does not arise from association in memory but from conjoint near specification of the same causal source in the environment--in speech, the moving vocal tract producing phonetic gestures.