首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Brief experience with reliable spectral characteristics of a listening context can markedly alter perception of subsequent speech sounds, and parallels have been drawn between auditory compensation for listening context and visual color constancy. In order to better evaluate such an analogy, the generality of acoustic context effects for sounds with spectral-temporal compositions distinct from speech was investigated. Listeners identified nonspeech sounds—extensively edited samples produced by a French horn and a tenor saxophone—following either resynthesized speech or a short passage of music. Preceding contexts were “colored” by spectral envelope difference filters, which were created to emphasize differences between French horn and saxophone spectra. Listeners were more likely to report hearing a saxophone when the stimulus followed a context filtered to emphasize spectral characteristics of the French horn, and vice versa. Despite clear changes in apparent acoustic source, the auditory system calibrated to relatively predictable spectral characteristics of filtered context, differentially affecting perception of subsequent target nonspeech sounds. This calibration to listening context and relative indifference to acoustic sources operates much like visual color constancy, for which reliable properties of the spectrum of illumination are factored out of perception of color.  相似文献   

2.
Speech sounds can be classified on the basis of their underlying articulators or on the basis of the acoustic characteristics resulting from particular articulatory positions. Research in speech perception suggests that distinctive features are based on both articulatory and acoustic information. In recent years, neuroelectric and neuromagnetic investigations provided evidence for the brain's early sensitivity to distinctive features and their acoustic consequences, particularly for place of articulation distinctions. Here, we compare English consonants in a Mismatch Field design across two broad and distinct places of articulation - labial and coronal - and provide further evidence that early evoked auditory responses are sensitive to these features. We further add to the findings of asymmetric consonant processing, although we do not find support for coronal underspecification. Labial glides (Experiment 1) and fricatives (Experiment 2) elicited larger Mismatch responses than their coronal counterparts. Interestingly, their M100 dipoles differed along the anterior/posterior dimension in the auditory cortex that has previously been found to spatially reflect place of articulation differences. Our results are discussed with respect to acoustic and articulatory bases of featural speech sound classifications and with respect to a model that maps distinctive phonetic features onto long-term representations of speech sounds.  相似文献   

3.

The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.

  相似文献   

4.
Two talkers' productions of the same phoneme may be quite different acoustically, whereas their productions of different speech sounds may be virtually identical. Despite this lack of invariance in the relationship between the speech signal and linguistic categories, listeners experience phonetic constancy across a wide range of talkers, speaking styles, linguistic contexts, and acoustic environments. The authors present evidence that perceptual sensitivity to talker variability involves an active cognitive mechanism: Listeners expecting to hear 2 different talkers differing only slightly in average pitch showed performance costs typical of adjusting to talker variability, whereas listeners hearing the same materials but expecting a single talker or given no special instructions did not show these performance costs. The authors discuss the implications for understanding phonetic constancy despite variability between talkers (and other sources of variability) and for theories of speech perception. The results provide further evidence for active, controlled processing in real-time speech perception and are consistent with a model of talker normalization that involves contextual tuning.  相似文献   

5.
Speech perception (SP) most commonly refers to the perceptual mapping from the highly variable acoustic speech signal to a linguistic representation, whether it be phonemes, diphones, syllables, or words. This is an example of categorization, in that potentially discriminable speech sounds are assigned to functionally equivalent classes. In this tutorial, we present some of the main challenges to our understanding of the categorization of speech sounds and the conceptualization of SP that has resulted from these challenges. We focus here on issues and experiments that define open research questions relevant to phoneme categorization, arguing that SP is best understood as perceptual categorization, a position that places SP in direct contact with research from other areas of perception and cognition.  相似文献   

6.
For nearly two decades it has been known that infants' perception of speech sounds is affected by native language input during the first year of life. However, definitive evidence of a mechanism to explain these developmental changes in speech perception has remained elusive. The present study provides the first evidence for such a mechanism, showing that the statistical distribution of phonetic variation in the speech signal influences whether 6- and 8-month-old infants discriminate a pair of speech sounds. We familiarized infants with speech sounds from a phonetic continuum, exhibiting either a bimodal or unimodal frequency distribution. During the test phase, only infants in the bimodal condition discriminated tokens from the endpoints of the continuum. These results demonstrate that infants are sensitive to the statistical distribution of speech sounds in the input language, and that this sensitivity influences speech perception.  相似文献   

7.
Listeners perceive speech sounds relative to context. Contextual influences might differ over hemispheres if different types of auditory processing are lateralized. Hemispheric differences in contextual influences on vowel perception were investigated by presenting speech targets and both speech and non-speech contexts to listeners’ right or left ears (contexts and targets either to the same or to opposite ears). Listeners performed a discrimination task. Vowel perception was influenced by acoustic properties of the context signals. The strength of this influence depended on laterality of target presentation, and on the speech/non-speech status of the context signal. We conclude that contrastive contextual influences on vowel perception are stronger when targets are processed predominately by the right hemisphere. In the left hemisphere, contrastive effects are smaller and largely restricted to speech contexts.  相似文献   

8.
Speech perception is an ecologically important example of the highly context-dependent nature of perception; adjacent speech, and even nonspeech, sounds influence how listeners categorize speech. Some theories emphasize linguistic or articulation-based processes in speech-elicited context effects and peripheral (cochlear) auditory perceptual interactions in non-speech-elicited context effects. The present studies challenge this division. Results of three experiments indicate that acoustic histories composed of sine-wave tones drawn from spectral distributions with different mean frequencies robustly affect speech categorization. These context effects were observed even when the acoustic context temporally adjacent to the speech stimulus was held constant and when more than a second of silence or multiple intervening sounds separated the nonlinguistic acoustic context and speech targets. These experiments indicate that speech categorization is sensitive to statistical distributions of spectral information, even if the distributions are composed of nonlinguistic elements. Acoustic context need be neither linguistic nor local to influence speech perception.  相似文献   

9.
Event-related potentials (ERPs) were utilized to study brain activity while subjects listened to speech and nonspeech stimuli. The effect of duplex perception was exploited, in which listeners perceive formant transitions that are isolated as nonspeech "chirps," but perceive formant transitions that are embedded in synthetic syllables as unique linguistic events with no chirp-like sounds heard at all (Mattingly et al., 1971). Brain ERPs were recorded while subjects listened to and silently identified plain speech-only tokens, duplex tokens, and tone glides (perceived as "chirps" by listeners). A highly controlled set of stimuli was developed that represented equivalent speech and nonspeech stimulus tokens such that the differences were limited to a single acoustic parameter: amplitude. The acoustic elements were matched in terms of number and frequency of components. Results indicated that the neural activity in response to the stimuli was different for different stimulus types. Duplex tokens had significantly longer latencies than the pure speech tokens. The data are consistent with the contention of separate modules for phonetic and auditory stimuli.  相似文献   

10.
Language experience clearly affects the perception of speech, but little is known about whether these differences in perception extend to non‐speech sounds. In this study, we investigated rhythmic perception of non‐linguistic sounds in speakers of French and German using a grouping task, in which complexity (variability in sounds, presence of pauses) was manipulated. In this task, participants grouped sequences of auditory chimeras formed from musical instruments. These chimeras mimic the complexity of speech without being speech. We found that, while showing the same overall grouping preferences, the German speakers showed stronger biases than the French speakers in grouping complex sequences. Sound variability reduced all participants’ biases, resulting in the French group showing no grouping preference for the most variable sequences, though this reduction was attenuated by musical experience. In sum, this study demonstrates that linguistic experience, musical experience, and complexity affect rhythmic grouping of non‐linguistic sounds and suggests that experience with acoustic cues in a meaningful context (language or music) is necessary for developing a robust grouping preference that survives acoustic variability.  相似文献   

11.
ABSTRACT

Carol Fowler has had a tremendous impact on the field of speech perception, in part by having people disagree with her. The disagreements arise, as they often do, from 2 incompatible sources: her positions are often misunderstood and thus “disagreed” with only on the surface, and her positions are rejected because they challenge deeply held, intuitively appealing positions without being shown to be wrong. The misunderstandings center largely on the assertion that perception is “direct.” This is often taken to mean that we have access to the speaker's vocal tract by some means other than the (largely acoustic) speech signal, when, in fact, it asserts that the signal is sufficient to directly specify that production. It is unclear why this misunderstanding persists; although there are still issues to be resolved in this regard, the stance is clear. The challenge to “acoustic” theories of speech perception remains, and thus direct perception is still controversial, as it seems that acoustic theories are held by a majority of researchers. Decades' worth of evidence showing the lack of usefulness of purely acoustic properties and the coherence gained by a production perspective have not changed this situation. Some attempts at combining the 2 perspectives have emerged, but they largely miss the Gibsonian challenge that Fowler has espoused: perception of speech is direct. It looks as though it will take some further decades of research and discussion to fully explore her position.  相似文献   

12.
A procedure based on monaural fusion has been developed to construct acoustic continua between natural speech sounds, to be used in studies of speech perception. Two speech stimuli of similar temporal structure and different spectral composition are precisely aligned in time and presented simultaneously to the listener. By mixing both stimulus components in varying intensity ratios, a transition from one component to the other can be achieved. Such stimulus continua have several advantages over the synthetic continua commonly used in studies of categorical perception and related phenomena: They are based on real speech stimuli; the endpoint stimuli are unambiguous; and the stimuli are characterized by a well-defined physical variable, the relative intensity of the two components.  相似文献   

13.
The language environment modifies the speech perception abilities found in early development. In particular, adults have difficulty perceiving many nonnative contrasts that young infants discriminate. The underlying perceptual reorganization apparently occurs by 10-12 months. According to one view, it depends on experiential effects on psychoacoustic mechanisms. Alternatively, phonological development has been held responsible, with perception influenced by whether the nonnative sounds occur allophonically in the native language. We hypothesized that a phonemic process appears around 10-12 months that assimilates speech sounds to native categories whenever possible; otherwise, they are perceived in auditory or phonetic (articulatory) terms. We tested this with English-speaking listeners by using Zulu click contrasts. Adults discriminated the click contrasts; performance on the most difficult (80% correct) was not diminished even when the most obvious acoustic difference was eliminated. Infants showed good discrimination of the acoustically modified contrast even by 12-14 months. Together with earlier reports of developmental change in perception of nonnative contrasts, these findings support a phonological explanation of language-specific reorganization in speech perception.  相似文献   

14.
It is not unusual to find it stated as a fact that the left hemisphere is specialized for the processing of rapid, or temporal aspects of sound, and that the dominance of the left hemisphere in the perception of speech can be a consequence of this specialization. In this review we explore the history of this claim and assess the weight of this assumption. We will demonstrate that instead of a supposed sensitivity of the left temporal lobe for the acoustic properties of speech, it is the right temporal lobe which shows a marked preference for certain properties of sounds, for example longer durations, or variations in pitch. We finish by outlining some alternative factors that contribute to the left lateralization of speech perception.  相似文献   

15.
Earlier experiments have shown that when one or more speech sounds in a sentence are replaced by a noise meeting certain criteria, the listener mislocalizes the extraneous sound and believes he hears the missing phoneme(s) clearly. The present study confirms and extends these earlier reports of phonemic restorations under a variety of novel conditions. All stimuli had some of the context necessary for the appropriate phonemic restoration following the missing sound, and all sentences had the missing phoneme deliberately mispronounced before electronic deletion (so that the neighboring phonemes could not provide acoustic cues to aid phonemic restorations). The results are interpreted in terms of mechanisms normally aiding veridical perception of speech and nonspeech sounds.  相似文献   

16.
Lip reading is the ability to partially understand speech by looking at the speaker's lips. It improves the intelligibility of speech in noise when audio-visual perception is compared with audio-only perception. A recent set of experiments showed that seeing the speaker's lips also enhances sensitivity to acoustic information, decreasing the auditory detection threshold of speech embedded in noise [J. Acoust. Soc. Am. 109 (2001) 2272; J. Acoust. Soc. Am. 108 (2000) 1197]. However, detection is different from comprehension, and it remains to be seen whether improved sensitivity also results in an intelligibility gain in audio-visual speech perception. In this work, we use an original paradigm to show that seeing the speaker's lips enables the listener to hear better and hence to understand better. The audio-visual stimuli used here could not be differentiated by lip reading per se since they contained exactly the same lip gesture matched with different compatible speech sounds. Nevertheless, the noise-masked stimuli were more intelligible in the audio-visual condition than in the audio-only condition due to the contribution of visual information to the extraction of acoustic cues. Replacing the lip gesture by a non-speech visual input with exactly the same time course, providing the same temporal cues for extraction, removed the intelligibility benefit. This early contribution to audio-visual speech identification is discussed in relationships with recent neurophysiological data on audio-visual perception.  相似文献   

17.
Developmental language learning impairments affect 10 to 20% of children and increase their risk of later literacy problems (dyslexia) and psychiatric disorders. Both oral- and written-language impairments have been linked to slow neural processing, which is hypothesized to interfere with the perception of speech sounds that are characterized by rapid acoustic changes. Research into the etiology of language learning impairments not only has led to improved diagnostic and intervention strategies, but also has raised fundamental questions about the neurobiological basis of speech, language, and reading, as well as hemispheric lateralization.  相似文献   

18.
Perceptual discrimination between speech sounds belonging to different phoneme categories is better than that between sounds falling within the same category. This property, known as "categorical perception," is weaker in children affected by dyslexia. Categorical perception develops from the predispositions of newborns for discriminating all potential phoneme categories in the world's languages. Predispositions that are not relevant for phoneme perception in the ambient language are usually deactivated during early childhood. However, the current study shows that dyslexic children maintain a higher sensitivity to phonemic distinctions irrelevant in their linguistic environment. This suggests that dyslexic children use an allophonic mode of speech perception that, although without straightforward consequences for oral communication, has obvious implications for the acquisition of alphabetic writing. Allophonic perception specifically affects the mapping between graphemes and phonemes, contrary to other manifestations of dyslexia, and may be a core deficit.  相似文献   

19.
20.
In this study, we attempted to determine whether phonetic disintegration of speech in Broca's aphasia affects the spectral characteristics of speech sounds as has been shown for the temporal characteristics of speech. To this end, we investigated the production of place of articulation in Broca's aphasics. Acoustic analysis of the spectral characteristics for stop consonants were conducted. Results indicated that the static aspects of speech production were preserved, as Broca's aphasics seemed to be able to reach the articulatory configuration for the appropriate place of articulation. However, the dynamic aspects of speech production seemed to be impaired, as their productions reflected problems with the source characteristics of speech sounds and with the integration of articulatory movements in the vocal tract. Listener perceptions of the aphasics' productions were compared with acoustic analyses for these same productions. The two measures were related; that is, the spectral characteristics of the utterances provided salient cues for place of articulation perception. An analysis of the occurrences of errors along the dimensions of voicing and place showed that aphasics rarely produce utterances containing both voice and place substitutions.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号