Similar Documents
20 similar documents found (search time: 31 ms)
1.
Morton, Marcus, and Frankish (1976) defined “perceptual center,” or “P-center,” as a neutral term to describe that which is regular in a perceptually regular sequence of speech sounds. This paper describes a paradigm for the determination of P-center location and the effect of various acoustic parameters on empirically determined P-center locations. It is shown that P-center location is affected by both initial consonant duration and, secondarily, subsequent vowel and consonant duration. A simple two-parameter model involving the duration of the whole stimulus is developed and gives good performance in predicting P-center location. The application of this model to continuous speech is demonstrated. It is suggested that there is little value in attempting to determine any single acoustic or articulatory correlate of P-center location, or in attempting to define P-center location absolutely in time. Rather, these results indicate that P-centers are a property of the whole stimulus and reflect properties of both the production and perception of speech.
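The kind of two-parameter duration model the abstract describes can be sketched in a few lines. The weights below are hypothetical placeholders, not the values fitted in the paper; the point is only the model's form, a weighted sum of initial-consonant duration and whole-stimulus duration:

```python
# Minimal sketch of a two-parameter P-center model of the kind the
# abstract describes: P-center location is predicted from the duration
# of the initial consonant(s) and the duration of the whole stimulus.
# The weights are hypothetical placeholders, not the paper's fitted values.

def predict_p_center(initial_consonant_ms: float,
                     whole_stimulus_ms: float,
                     w_consonant: float = 0.5,
                     w_whole: float = 0.1) -> float:
    """Predicted P-center location, in ms after acoustic onset."""
    return w_consonant * initial_consonant_ms + w_whole * whole_stimulus_ms

# Example: a syllable with a 100-ms initial consonant and 400-ms total duration.
print(predict_p_center(100.0, 400.0))  # 90.0 ms with the placeholder weights
```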

2.
The multistable perception of speech, or verbal transformation effect, refers to perceptual changes experienced while listening to a speech form that is repeated rapidly and continuously. In order to test whether visual information from the speaker's articulatory gestures may modify the emergence and stability of verbal auditory percepts, subjects were instructed to report any perceptual changes during unimodal, audiovisual, and incongruent audiovisual presentations of distinct repeated syllables. In a first experiment, the perceptual stability of reported auditory percepts was significantly modulated by the modality of presentation. In a second experiment, when audiovisual stimuli consisting of a stable audio track dubbed with a video track that alternated between congruent and incongruent stimuli were presented, a strong correlation between the timing of perceptual transitions and the timing of video switches was found. Finally, a third experiment showed that the vocal tract opening onset event provided by the visual input could play the role of a bootstrap mechanism in the search for transformations. Altogether, these results demonstrate the capacity of visual information to control the multistable perception of speech in its phonetic content and temporal course. The verbal transformation effect thus provides a useful experimental paradigm to explore audiovisual interactions in speech perception.

3.
The approximately 20-msec perceptual threshold for identifying order of onset for components of auditory stimuli has been considered both as a possible factor contributing to the perception of voicing contrasts in speech and as no more than a methodological artifact. In the present research, we investigate the identification of the temporal order of onset of spectral components in terms of the first of a sequence of thresholds for complex stimuli (modeled after consonant-vowel [CV] syllables) that vary in degree of onset. The results provide clear evidence that the difference limen (DL) for discriminating differences in onset time follows predictions based on a fixed perceptual threshold or limit at relatively short onset differences. Furthermore, the DL seems to be a function of context coding of stimulus information, with both the DL and absolute threshold probably reflecting limits on the effective perception and coding of the short-term stimulus spectrum.

4.
Sound symbolism refers to non-arbitrary mappings between the sounds of words and their meanings and is often studied by pairing auditory pseudowords such as “maluma” and “takete” with rounded and pointed visual shapes, respectively. However, it is unclear what auditory properties of pseudowords contribute to their perception as rounded or pointed. Here, we compared perceptual ratings of the roundedness/pointedness of large sets of pseudowords and shapes to their acoustic and visual properties using a novel application of representational similarity analysis (RSA). Representational dissimilarity matrices (RDMs) of the auditory and visual ratings of roundedness/pointedness were significantly correlated crossmodally. The auditory perceptual RDM correlated significantly with RDMs of spectral tilt, the temporal fast Fourier transform (FFT), and the speech envelope. Conventional correlational analyses showed that ratings of pseudowords transitioned from rounded to pointed as vocal roughness (as measured by the harmonics-to-noise ratio, pulse number, fraction of unvoiced frames, mean autocorrelation, shimmer, and jitter) increased. The visual perceptual RDM correlated significantly with RDMs of global indices of visual shape (the simple matching coefficient, image silhouette, image outlines, and Jaccard distance). Crossmodally, the RDMs of the auditory spectral parameters correlated weakly but significantly with those of the global indices of visual shape. Our work establishes the utility of RSA for analysis of large stimulus sets and offers novel insights into the stimulus parameters underlying sound symbolism, showing that sound-to-shape mapping is driven by acoustic properties of pseudowords and suggesting audiovisual cross-modal correspondence as a basis for language users' sensitivity to this type of sound symbolism.
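The core RSA step, building representational dissimilarity matrices from perceptual ratings and from acoustic features and then rank-correlating them, can be sketched as follows; the toy data and feature dimensionality are invented for illustration:

```python
# Minimal sketch of the RSA logic described above, assuming each
# pseudoword has a roundedness/pointedness rating and a vector of
# acoustic features. The data below are random placeholders.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_items = 20
ratings = rng.uniform(1, 7, size=(n_items, 1))   # perceptual ratings, 1-7 scale
acoustic = rng.normal(size=(n_items, 12))        # e.g., spectral features

# Representational dissimilarity matrices, in condensed (pairwise) form.
perceptual_rdm = pdist(ratings, metric="euclidean")
acoustic_rdm = pdist(acoustic, metric="correlation")

# Correlate the two RDMs; rank correlation is standard in RSA.
rho, p = spearmanr(perceptual_rdm, acoustic_rdm)
print(f"RDM correlation: rho={rho:.3f}, p={p:.3f}")
```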

5.
Auditory hallucination is a key characteristic of schizophrenia that seriously debilitates the patient, with consequences for social engagement with others. Hallucinatory experiences are also observed in healthy individuals in the general population who report “hearing voices” in the absence of an external acoustic input. A view on auditory hallucinations and “hearing voices” is presented that regards such phenomena as perceptual processes, originating from speech perception areas in the left temporal lobe. Healthy individuals “hearing voices” are, however, often aware that the experience comes from inner thought processes, which is not reported by hallucinating patients. A perceptual model alone cannot, therefore, explain the difference in the phenomenology of how the “voices heard” are attributed to either an inner or outer cause. An expanded model is thus presented which takes into account top-down cognitive control, localized to prefrontal cortical areas, to inhibit and re-attribute the perceptual misrepresentations. It is suggested that the expanded model can be empirically validated using a dichotic listening speech perception paradigm with instructions for top-down control of attention focus to either the right or left side in auditory space. It is furthermore suggested that fMRI be used to validate the temporal and frontal lobe neuronal correlates of the cognitive processes involved in auditory hallucinations.

6.
Neurological and behavioral findings indicate that atypical auditory processing characterizes autism. The present study tested the hypothesis that auditory processing is less domain-specific in autism than in typical development. Participants with autism and controls completed a pitch sequence discrimination task in which same/different judgments of music and/or speech stimulus pairs were made. A signal detection analysis showed no difference in pitch sensitivity across conditions in the autism group, while controls exhibited significantly poorer performance in conditions incorporating speech. The results are largely consistent with perceptual theories of autism, which propose that a processing bias towards featural/low-level information characterizes the disorder, as well as supporting the notion that such individuals exhibit selective attention to a limited number of simultaneously presented cues.

7.
In three separate experiments, Ss were provided with auditory, visual, or simultaneous auditory and visual information in a classification task. Difficulty of classification was manipulated by varying the stimulus exposure duration. Consistent bisensory facilitation effects were noted for later trials, with interference evident on earlier trials. Exposure duration influenced rate and not amount of learning, with bisensory performance being most affected by duration. A transfer paradigm was used in Experiment III, and little if any transfer was noted between unisensory and bisensory stimulus conditions. It was concluded that Ss were extracting the most salient stimulus components from the auditory and visual information streams and combining them into a unidimensional information configuration.

8.
Event-related brain potentials (ERPs) were used to determine whether low left-hemisphere arousal or unusual cortical responses to speech stimuli might be associated with anomalies in language function that reportedly occur when psychopaths perform lateralized information-processing tasks. ERPs to phonemic stimuli were recorded while 11 psychopathic (P) and 13 nonpsychopathic (NP) male prison inmates performed a Single-Task and a Dual-Task. In the Single-Task, a speech discrimination ‘oddball’ paradigm, the subject was required to respond whenever a target stimulus (the less frequent of two phonemes) occurred. In the Dual-Task, he had to respond to target stimuli while simultaneously performing a perceptual-motor (distractor) task. There were no group differences in ERP measures of central arousal (N100) during performance of the Single- and Dual-Tasks. For both groups, the P300 component of the ERP to the target stimulus was smaller and had longer latency during the Dual-Task than during the Single-Task, indicating that in the Dual-Task phonemic discrimination and the perceptual-motor task competed for similar perceptual resources. Overlapping Group P's P300 responses to the target stimulus during the Dual-Task was a positive slow wave (SW), maximal at the vertex and asymmetric (left-hemisphere), suggesting unusual speech processing in psychopaths under conditions of distraction, perhaps related to reduced sensitivity to the sequential probabilities associated with events presented in an auditory channel. The results were consistent with the hypothesis that psychopaths have limited left-hemisphere resources for processing linguistic stimuli.

9.
Numerous investigators have reported that listeners are able to perceptually differentiate adult stutterers' and nonstutterers' fluent speech productions. However, findings from similar studies with children ranging in age from 3 to 9 yr have indicated that perceptual discrimination of child stutterers is difficult. A logical extension of this line of investigation would be to determine when during maturation from childhood to adulthood stutterers' fluent speech becomes perceptibly different from nonstutterers'. Therefore, in this study similar fluent speech samples from seven 12–16-yr-old adolescent male stutterers and seven matched nonstutterers were analyzed perceptually in a paired stimulus paradigm by 15 sophisticated listeners. Individual subject analyses using signal detection theory revealed that five of the seven stutterers were discriminated. When averaged for subject group comparison, these findings indicated that listeners successfully discriminated between the fluent speech of the two groups. Therefore, the perceptual difference in fluent speech production reported previously for adults appears to be present by adolescence.
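A minimal sketch of the signal detection analysis mentioned here computes d' per listener from hit and false-alarm rates in the paired ("same"/"different") judgments; the rates and trial counts below are illustrative, not the study's data:

```python
# Sketch of a per-listener d' computation for a same/different paired
# stimulus paradigm, with a standard correction for rates of exactly 0 or 1.
from scipy.stats import norm

def d_prime(hit_rate: float, fa_rate: float, n_trials: int) -> float:
    """d' = z(hit rate) - z(false-alarm rate), clamped away from 0 and 1."""
    lo, hi = 1 / (2 * n_trials), 1 - 1 / (2 * n_trials)
    h = min(max(hit_rate, lo), hi)
    f = min(max(fa_rate, lo), hi)
    return norm.ppf(h) - norm.ppf(f)

# Example: a listener calls 80% of stutterer/nonstutterer pairs "different"
# (hits) and 30% of identical pairs "different" (false alarms).
print(d_prime(0.80, 0.30, n_trials=50))  # ~1.37
```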

10.
Research in perceptual decision making is dominated by paradigms that tap the visual system, such as the random-dot motion (RDM) paradigm. In this study, we investigated whether the behavioral signature of perceptual decisions in the auditory domain is similar to those observed in the visual domain. We developed an auditory version of the RDM task, in which tones correspond to dots and pitch corresponds to motion (the random-tone pitch task, RTP). In this task, participants have to decide quickly whether the pitch of a “sound cloud” of tones is moving up or down. Stimulus strength and speed–accuracy trade-off were manipulated. To describe the relationship between stimulus strength and performance, we fitted the proportional-rate diffusion model to the data. The results showed a close coupling between stimulus strength and the speed and accuracy of perceptual decisions in both tasks. Additionally, we fitted the full drift diffusion model (DDM) to the data and showed that three of the four participants had similar speed–accuracy trade-offs in both tasks. However, for the RTP task, drift rates were larger and nondecision times longer, suggesting that some DDM parameters might be dependent on stimulus modality (drift rate and nondecision time), whereas others might not be (decision bound). The results illustrate that the RTP task is suitable for investigating the dynamics of auditory perceptual choices. Future studies using the task might help to investigate modality-specific effects on decision making at both the behavioral and neuronal levels.
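The drift diffusion model referred to above can be illustrated with a minimal forward simulation: noisy evidence accumulates toward one of two bounds, and the bound-crossing time plus a nondecision time yields the predicted RT. Parameter values below are illustrative, not the fitted values from the study:

```python
# Sketch of a two-bound drift diffusion model simulated with
# Euler-Maruyama steps. Illustrative parameters only.
import numpy as np

def simulate_ddm(drift: float, bound: float, ndt: float,
                 dt: float = 0.001, noise: float = 1.0,
                 rng=np.random.default_rng(1)):
    """Return (choice, reaction time in s) for one simulated trial."""
    x, t = 0.0, 0.0
    while abs(x) < bound:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    return (1 if x > 0 else 0), t + ndt

# Example: stronger stimuli map to larger drift rates, hence faster,
# more accurate decisions.
print([simulate_ddm(drift=1.5, bound=1.0, ndt=0.3) for _ in range(5)])
```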

11.
Backward masking, the suffix effect, and preperceptual storage
This article considers the use of auditory backward recognition masking (ABRM) and stimulus suffix experiments as indexes of preperceptual auditory storage. In the first part of the article, two ABRM experiments that failed to demonstrate a mask disinhibition effect found previously in stimulus suffix experiments are reported. The failure to demonstrate mask disinhibition is inconsistent with an explanation of ABRM in terms of lateral inhibition. In the second part of the article, evidence is presented to support the conclusion that the suffix effect involves the contributions of later processing stages and does not provide an uncontaminated index of preperceptual storage. In contrast, it is claimed that ABRM experiments provide the most direct index of the temporal course of perceptual recognition. Partial-report tasks and other paradigms are also evaluated in terms of their contributions to an understanding of preperceptual auditory storage. Differences between interruption and integration masking are discussed along with the role of preperceptual auditory storage in speech perception.

12.
Speech perception is an ecologically important example of the highly context-dependent nature of perception; adjacent speech, and even nonspeech, sounds influence how listeners categorize speech. Some theories emphasize linguistic or articulation-based processes in speech-elicited context effects and peripheral (cochlear) auditory perceptual interactions in non-speech-elicited context effects. The present studies challenge this division. Results of three experiments indicate that acoustic histories composed of sine-wave tones drawn from spectral distributions with different mean frequencies robustly affect speech categorization. These context effects were observed even when the acoustic context temporally adjacent to the speech stimulus was held constant and when more than a second of silence or multiple intervening sounds separated the nonlinguistic acoustic context and speech targets. These experiments indicate that speech categorization is sensitive to statistical distributions of spectral information, even if the distributions are composed of nonlinguistic elements. Acoustic context need be neither linguistic nor local to influence speech perception.

13.
In the McGurk effect, perceptual identification of auditory speech syllables is influenced by simultaneous presentation of discrepant visible speech syllables. This effect has been found in subjects of different ages and with various native language backgrounds. But no McGurk tests have been conducted with prelinguistic infants. In the present series of experiments, 5-month-old English-exposed infants were tested for the McGurk effect. Infants were first gaze-habituated to an audiovisual /va/. Two different dishabituation stimuli were then presented: audio /ba/-visual /va/ (perceived by adults as /va/), and audio /da/-visual /va/ (perceived by adults as /da/). The infants showed generalization from the audiovisual /va/ to the audio /ba/-visual /va/ stimulus but not to the audio /da/-visual /va/ stimulus. Follow-up experiments revealed that these generalization differences were not due to a general preference for the audio /da/-visual /va/ stimulus or to the auditory similarity of /ba/ to /va/ relative to /da/. These results suggest that the infants were visually influenced in the same way as English-speaking adults are.

14.
Crossmodal selective attention was investigated in a cued task switching paradigm using bimodal visual and auditory stimulation. A cue indicated the imperative modality. Three levels of spatial S–R associations were established following perceptual (location), structural (numerical), and conceptual (verbal) set-level compatibility. In Experiment 1, participants switched attention between the auditory and visual modality either with a spatial-location or spatial-numerical stimulus set. In the spatial-location set, participants performed a localization judgment on left vs. right presented stimuli, whereas the spatial-numerical set required a magnitude judgment about a visually or auditorily presented number word. Single-modality blocks with unimodal stimuli were included as a control condition. In Experiment 2, the spatial-numerical stimulus set was replaced by a spatial-verbal stimulus set using direction words (e.g., “left”). RT data showed modality switch costs, which were asymmetric across modalities in the spatial-numerical and spatial-verbal stimulus set (i.e., larger for auditory than for visual stimuli), and congruency effects, which were asymmetric primarily in the spatial-location stimulus set (i.e., larger for auditory than for visual stimuli). This pattern of effects suggests task-dependent visual dominance.

15.
The verbal transformation effect, an auditory illusion in which physically invariant repetitive verbal input undergoes perceptual transformation, has traditionally been interpreted as a speech-specific phenomenon. Experiment 1 showed that the effect is not limited to speech, but occurs in non-speech categories such as music and other complex everyday sounds, with transformations being comparable in nature and number to those in speech. Experiment 2 provided evidence for an alternative, broader-based view of the phenomenon, involving spreading activation through a multidimensional associative network of mental representations, by demonstrating that creating or activating pre-existing links between a single complex non-verbal stimulus and other representations by priming led to an increase in transformations.

16.
The present experiment uses the perceptual adaptation paradigm to establish the validity of a previous test of the feature detector model of speech perception. In the present study, a synthetic stimulus series varied from a CV syllable, [ba], to a nonspeech buzz. When the endpoint tokens were employed alternatively as adaptors, the category boundary was shifted relative to unadapted identification in each adaptor condition. This result suggests that a prior test which used a vowel as the speech endpoint was legitimate because a stop consonant, an exemplary speech sound, was also susceptible to perceptual adaptation in a speech-nonspeech context. Feature detector models predict, incorrectly, that this outcome is impossible. Therefore, this finding may be taken to undermine the interpretation of adaptation as fatigue in a set of detectors tuned to detect the distinctive features of linguistic analysis.

17.
Perceptual changes are experienced during rapid and continuous repetition of a speech form, leading to an auditory illusion known as the verbal transformation effect. Although verbal transformations are considered to reflect mainly the perceptual organization and interpretation of speech, the present study was designed to test whether or not speech production constraints may participate in the emergence of verbal representations. With this goal in mind, we examined whether variations in the articulatory cohesion of repeated nonsense words--specifically, temporal relationships between articulatory events--could lead to perceptual asymmetries in verbal transformations. The first experiment displayed variations in timing relations between two consonantal gestures embedded in various nonsense syllables in a repetitive speech production task. In the second experiment, French participants repeatedly uttered these syllables while searching for verbal transformation. Syllable transformation frequencies followed the temporal clustering between consonantal gestures: The more synchronized the gestures, the more stable and attractive the syllable. In the third experiment, which involved a covert repetition mode, the pattern was maintained without external speech movements. However, when a purely perceptual condition was used in a fourth experiment, the previously observed perceptual asymmetries of verbal transformations disappeared. These experiments demonstrate the existence of an asymmetric bias in the verbal transformation effect linked to articulatory control constraints. The persistence of this effect from an overt to a covert repetition procedure provides evidence that articulatory stability constraints originating from the action system may be involved in auditory imagery. The absence of the asymmetric bias during a purely auditory procedure rules out perceptual mechanisms as a possible explanation of the observed asymmetries.

18.
An experiment is reported which uses a same-different matching paradigm in which subjects are required to indicate whether the consonants of a pair of consonant-diphthong syllables are the same or different. The question addressed is the operation of two hypothesized processes in the perception of speech sounds. The auditory level is shown to hold stimulus information for a brief period of time and to be sensitive to allophonic variations within a stimulus. Moreover, matching at this level takes place by identity of the syllables rather than of the separate phoneme segments. The phonemic level is impaired when the diphthong segments of the pair lead to a match that contradicts that of the consonants of the pair, even though only the consonants are relevant to the matching decision.

19.
Apparent changes in auditory scenes are often unnoticed. This change deafness phenomenon was examined in auditory scenes that comprise human voices. In two experiments, listeners were required to detect changes between two auditory scenes comprising two, three, and four talkers who voiced four-syllable words. One of the voices in the first scene was randomly selected and was replaced with a new word in change trials. The rationale was that higher stimulus familiarity conferred by human voices compared to other everyday sounds, together with encoding and memory advantages for verbal stimuli and the modular processing of speech in auditory processing, should positively influence the change detection efficiency, and the change deafness phenomenon should not be observed when listeners are explicitly required to detect the obvious changes. Contrary to the prediction, change deafness was significantly observed in three- and four-talker conditions. This indicates that change deafness occurs in listeners even for highly familiar stimuli. This suggests the limited ability for perceptual organization of auditory scenes comprising even a relatively small number of voices (three or four).

20.
Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that message is presented in a noisy background. Speech is a particularly important example of multisensory integration because of its behavioural relevance to humans and also because brain regions have been identified that appear to be specifically tuned for auditory speech and lip gestures. Previous research has suggested that speech stimuli may have an advantage over other types of auditory stimuli in terms of audio-visual integration. Here, we used a modified adaptive psychophysical staircase approach to compare the influence of congruent visual stimuli (brief movie clips) on the detection of noise-masked auditory speech and non-speech stimuli. We found that congruent visual stimuli significantly improved detection of an auditory stimulus relative to incongruent visual stimuli. This effect, however, was equally apparent for speech and non-speech stimuli. The findings suggest that speech stimuli are not specifically advantaged by audio-visual integration for detection at threshold when compared with other naturalistic sounds.
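An adaptive staircase of the general kind the abstract mentions can be sketched with a simple 2-down/1-up rule, which converges on roughly 70.7% correct; the paper's modified staircase may differ, and the simulated observer below is a hypothetical stand-in for a real listener:

```python
# Sketch of a 2-down/1-up adaptive staircase: the signal-to-noise ratio
# is lowered after two consecutive correct detections and raised after
# an error; the threshold is estimated from the reversal points.
import random

def toy_observer(snr_db: float) -> bool:
    """Hypothetical observer whose detection probability rises with SNR."""
    p_correct = min(0.99, max(0.01, 0.5 + 0.05 * (snr_db + 10)))
    return random.random() < p_correct

snr, step, correct_streak, reversals = 0.0, 2.0, 0, []
last_direction = None
while len(reversals) < 8:
    if toy_observer(snr):
        correct_streak += 1
        if correct_streak == 2:           # two correct in a row -> harder
            correct_streak = 0
            if last_direction == "up":    # direction change = reversal
                reversals.append(snr)
            snr -= step
            last_direction = "down"
    else:                                 # one error -> easier
        correct_streak = 0
        if last_direction == "down":      # direction change = reversal
            reversals.append(snr)
        snr += step
        last_direction = "up"

print("threshold estimate (dB SNR):", sum(reversals) / len(reversals))
```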
