Similar Articles
20 similar articles found (search time: 31 ms)
1.
Prevocalic and postvocalic (unreleased) occurrences of a stop consonant differ in acoustic shape, but are not unrelated. In particular, the formant transitions taking place at release of a stop consonant approximately mirror in time the formant transitions occurring during closure, assuming that the vowel is the same. Several experiments have been performed using brief two-component tone burst approximations to the second and third formant transitions that occur in prevocalic and postvocalic allophones of /b, d, g/ in order to determine whether such mirror-image acoustic patterns are perceptually related. Listener judgments of similarity within triads of these stimuli indicate that mirror-image patterns representing the same place of articulation are less similar to each other than to patterns representing different places of articulation. The implications, for the child acquiring language, of the fact that mirror-image patterns in speech do not have inherent perceptual similarity are discussed.

2.
Selective adaptation with a syllable-initial consonant fails to affect perception of the same consonant in syllable-final position, and vice versa. One account of this well-replicated result invokes a cancellation explanation: with the place-of-articulation stimuli used, the pattern of formant transitions switches according to syllabic position, allowing putative phonetic-level effects to be opposed by putative acoustic-level effects. Three experiments tested the cancellation hypothesis by preempting the possibility of acoustic countereffects. In Experiment 1, the test syllables and adaptors were /r/-/l/ CVs and VCs, which do not produce cancelling formant patterns across syllabic position. In Experiment 2, /b/-/d/ continua were used in a paired-contrast procedure, believed to be sensitive to phonetic, but not acoustic, identity. In Experiment 3, cross-ear adaptation, also believed to tap phonetic rather than acoustic processes, was used. All three experiments refuted the cancellation hypothesis. Instead, it appears that the perceptual process treats syllable-initial consonants and syllable-final ones as inherently different. These results provide support for the use of demisyllabic representations in speech perception.

3.
刘文理  祁志强 《心理科学》2016,39(2):291-298
Using a priming paradigm, two experiments examined priming effects in the perception of consonant and vowel categories. The primes were pure tones and the target categories themselves; the targets were consonant-category and vowel-category continua. Results showed that the percentage of category responses in perceiving the consonant continuum was affected by both the pure-tone and the speech primes, whereas reaction times for consonant categorization were affected only by the speech primes. The percentage of category responses for the vowel continuum was affected by neither type of prime, but reaction times for vowel categorization were affected by the speech primes. These results indicate that priming effects differ between consonant-category and vowel-category perception, providing new evidence for a difference in the underlying processing mechanisms of consonants and vowels.

4.
We examined the perceptual weighting by children and adults of the acoustic properties specifying complete closure of the vocal tract following a syllable-initial [s]. Experiment 1 was a novel manipulation of previously examined acoustic properties (duration of a silent gap and first formant transition) and showed that children weight the first formant transition more than adults. Experiment 2, an acoustic analysis of naturally produced "say" and "stay", revealed that, contrary to expectations, a burst can be present in "stay" and that first formant transitions do not necessarily distinguish "say" and "stay" in natural tokens. Experiment 3 manipulated natural speech portions to create stimuli that varied primarily in the duration of the silent gap and in the presence or absence of a stop burst, and showed that children weight these stop bursts less than adults. Taken together, the perception experiments support claims that children integrate multiple acoustic properties as adults do, but that they weight dynamic properties of the signal more than adults and weight static properties less.

5.
An experiment was conducted which assessed the relative contributions of three acoustic cues to the distinction between stop consonant and semivowel in syllable initial position. Subjects identified three series of syllables which varied perceptually from [ba] to [wa]. The stimuli differed only in the extent, duration, and rate of the second formant transition. In each series, one of the variables remained constant while the other two changed. Obtained identification ratings were plotted as a function of each variable. The results indicated that second formant transition duration and extent contribute significantly to perception. Short second formant transition extents and durations signal stops, while long second formant transition extents and durations signal semivowels. It was found that second formant transition rate did not contribute significantly to this distinction. Any particular rate could signal either a stop or semivowel. These results are interpreted as arguing against models that incorporate transition rate as a cue to phonetic distinctions. In addition, these results are related to a previous selective adaptation experiment. It is shown that the "phonetic" interpretation of the obtained adaptation results was not justified.

6.
How does the brain extract invariant properties of variable-rate speech? A neural model, called PHONET, is developed to explain aspects of this process and, along the way, data about perceptual context effects. For example, in consonant-vowel (CV) syllables, such as /ba/ and /wa/, an increase in the duration of the vowel can cause a switch in the percept of the preceding consonant from /w/ to /b/ (J.L. Miller & Liberman, 1979). The frequency extent of the initial formant transitions of fixed duration also influences the percept (Schwab, Sawusch, & Nusbaum, 1981). PHONET quantitatively simulates over 98% of the variance in these data, using a single set of parameters. The model also qualitatively explains many data about other perceptual context effects. In the model, C and V inputs are filtered by parallel auditory streams that respond preferentially to the transient and sustained properties of the acoustic signal before being stored in parallel working memories. A lateral inhibitory network of onset- and rate-sensitive cells in the transient channel extracts measures of frequency transition rate and extent. Greater activation of the transient stream can increase the processing rate in the sustained stream via a cross-stream automatic gain control interaction. The stored activities across these gain-controlled working memories provide a basis for rate-invariant perception, since the transient-to-sustained gain control tends to preserve the relative activities across the transient and sustained working memories as speech rate changes. Comparisons with alternative models tested suggest that the fit cannot be attributed to the simplicity of the data. Brain analogues of model cell types are described.
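The rate-invariance argument above can be illustrated with a toy calculation: if the gain applied to the sustained stream scales with the transition rate measured in the transient stream, then stretching a token in time while lowering its gain leaves the stored working-memory activity unchanged. This is an illustrative sketch of that idea only, not the PHONET equations; all signals and gain values below are hypothetical.

```python
import numpy as np

def store(signal, gain):
    """Integrate a signal into a working memory at a rate set by `gain`
    (standing in for the cross-stream automatic gain control)."""
    memory = 0.0
    for sample in signal:
        memory += gain * sample
    return memory

# A fast token: a brief transition (high transient activity, so high gain).
# A slow token: the same shape stretched to twice the duration (half the gain).
fast_trace = store(np.ones(5), gain=2.0)
slow_trace = store(np.ones(10), gain=1.0)

# Because gain covaries with rate, the stored activity is identical at
# both speaking rates: a basis for rate-invariant perception.
```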

7.
Speech perception can be viewed in terms of the listener's integration of two sources of information: the acoustic features transduced by the auditory receptor system and the context of the linguistic message. The present research asked how these sources were evaluated and integrated in the identification of synthetic speech. A speech continuum between the glide-vowel syllables /ri/ and /li/ was generated by varying the onset frequency of the third formant. Each sound along the continuum was placed in a consonant-cluster vowel syllable after an initial consonant /p/, /t/, /s/, and /v/. In English, both /r/ and /l/ are phonologically admissible following /p/ but are not admissible following /v/. Only /l/ is admissible following /s/ and only /r/ is admissible following /t/. A third experiment used synthetic consonant-cluster vowel syllables in which the first consonant varied between /b/ and /d/ and the second consonant varied between /l/ and /r/. Identification of synthetic speech varying in both acoustic featural information and phonological context allowed quantitative tests of various models of how these two sources of information are evaluated and integrated in speech perception.
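The abstract does not name the candidate models, but one well-known form such a quantitative test can take is a multiplicative (fuzzy-logical style) integration of independent supports, in which each source contributes evidence for /r/ versus /l/ and the products are normalized. The sketch below assumes this rule and uses hypothetical support values for illustration.

```python
def integrate(acoustic_support, context_support):
    """Combine two independent supports for /r/ (each in [0, 1])
    multiplicatively, normalized against the /l/ alternative."""
    r = acoustic_support * context_support
    l = (1.0 - acoustic_support) * (1.0 - context_support)
    return r / (r + l)

# With an acoustically ambiguous token (0.5), context alone decides:
p_after_t = integrate(0.5, 0.9)  # /t/ context strongly favors /r/
p_after_s = integrate(0.5, 0.1)  # /s/ context strongly favors /l/
```

A property of this rule worth noting: an ambiguous cue passes the contextual support through unchanged, while a clear acoustic cue dominates regardless of context, which is the qualitative pattern such models are tested against.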

8.
The basis for the invariant perception of place of articulation in pre- and postvocalic stops was investigated using the selective adaptation paradigm. Experiments 1 and 2 considered the role of identical bursts, mirror-image formant transitions, and similar onset and offset spectra in the invariant perception of place of articulation in CV and VC stimuli, and Experiment 3 considered the importance of the second two cues in a VCV context. The results of these experiments suggest that, at the level of processing tapped by selective adaptation, neither identical bursts, mirror-image formant transitions, nor similar onset and offset spectra are the basis for the invariant perception of place of articulation in initial and final position. The vowel portion of an adapter was found to affect perception of the consonant portion of a stimulus, and the direction of this effect was predictable from the acoustic characteristics of the consonant and vowel. The implications of these findings for the nature of selective adaptation are discussed.

9.
Learning a second language as an adult is particularly effortful when new phonetic representations must be formed. Therefore the processes that allow learning of speech sounds are of great theoretical and practical interest. Here we examined whether perception of single formant transitions, that is, sound components critical in speech perception, can be enhanced through an implicit task-irrelevant learning procedure that has been shown to produce visual perceptual learning. The single-formant sounds were paired at subthreshold levels with the attended targets in an auditory identification task. Results showed that task-irrelevant learning occurred for the unattended stimuli. Surprisingly, the magnitude of this learning effect was similar to that following explicit training on auditory formant transition detection using discriminable stimuli in an adaptive procedure, whereas explicit training on the subthreshold stimuli produced no learning. These results suggest that in adults learning of speech parts can occur at least partially through implicit mechanisms.

10.
Sussman HM  Fruchter D  Hilbert J  Sirosh J 《The Behavioral and brain sciences》1998,21(2):241-59; discussion 260-99
Neuroethological investigations of mammalian and avian auditory systems have documented species-specific specializations for processing complex acoustic signals that could, if viewed in abstract terms, have an intriguing and striking relevance for human speech sound categorization and representation. Each species forms biologically relevant categories based on combinatorial analysis of information-bearing parameters within the complex input signal. This target article uses known neural models from the mustached bat and barn owl to develop, by analogy, a conceptualization of human processing of consonant plus vowel sequences that offers a partial solution to the noninvariance dilemma--the nontransparent relationship between the acoustic waveform and the phonetic segment. Critical input sound parameters used to establish species-specific categories in the mustached bat and barn owl exhibit high correlation and linearity due to physical laws. A cue long known to be relevant to the perception of stop place of articulation is the second formant (F2) transition. This article describes an empirical phenomenon--the locus equations--that describes the relationship between the F2 of a vowel and the F2 measured at the onset of a consonant-vowel (CV) transition. These variables, F2 onset and F2 vowel within a given place category, are consistently and robustly linearly correlated across diverse speakers and languages, and even under perturbation conditions as imposed by bite blocks. A functional role for this category-level extreme correlation and linearity (the "orderly output constraint") is hypothesized based on the notion of an evolutionarily conserved auditory-processing strategy. High correlation and linearity between critical parameters in the speech signal that help to cue place of articulation categories might have evolved to satisfy a preadaptation by mammalian auditory systems for representing tightly correlated, linearly related components of acoustic signals.  
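The locus-equation fit described above is an ordinary least-squares line relating F2 measured at CV onset to F2 at the vowel midpoint, computed within one place-of-articulation category. A minimal sketch (the F2 measurements below are hypothetical values for a single speaker's /d/-initial tokens, chosen only to illustrate the tight linearity the article calls the "orderly output constraint"):

```python
import numpy as np

# Hypothetical F2 values (Hz) across six vowel contexts for /d/ + vowel:
f2_vowel = np.array([900, 1200, 1500, 1800, 2100, 2400], dtype=float)
f2_onset = np.array([1450, 1560, 1690, 1790, 1910, 2020], dtype=float)

# Fit the locus equation: F2_onset = slope * F2_vowel + intercept.
slope, intercept = np.polyfit(f2_vowel, f2_onset, 1)

# r^2 quantifies the linearity of the category.
pred = slope * f2_vowel + intercept
ss_res = np.sum((f2_onset - pred) ** 2)
ss_tot = np.sum((f2_onset - f2_onset.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot
```

In locus-equation studies the slope and intercept pattern by place category across speakers; for these illustrative data the slope falls between 0 and 1 and the fit is nearly perfect, as the article reports for real tokens.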

11.
How do acoustic attributes of the speech signal contribute to feature-processing interactions that occur in phonetic classification? In a series of five experiments addressed to this question, listeners performed speeded classification tasks that explicitly required a phonetic decision for each response. Stimuli were natural consonant-vowel syllables differing by multiple phonetic features, although classification responses were based on a single target feature. In control tasks, no variations in nontarget features occurred, whereas in orthogonal tasks nonrelevant feature variations occurred but had to be ignored. Comparison of classification times demonstrated that feature information may either be processed separately as independent cues for each feature or as a single integral segment that jointly specifies several features. The observed form of processing depended on the acoustic manifestations of feature variation in the signal. Stop-consonant place of articulation and voicing cues, conveyed independently by the pattern and excitation source of the initial formant transitions, may be processed separately. However, information for consonant place of articulation and vowel quality, features that interactively affect the shape of initial formant transitions, is processed as an integral segment. Articulatory correlates of each type of processing are discussed in terms of the distinction between source features that vary discretely in speech production and resonance features that can change smoothly and continuously. Implications for perceptual models that include initial segmentation of an input utterance into a phonetic feature representation are also considered.

12.
Three selective adaptation experiments were run, using nonspeech stimuli (music and noise) to adapt speech continua ([ba]-[wa] and [cha]-[sha]). The adaptors caused significant phoneme boundary shifts on the speech continua only when they matched in periodicity: Music stimuli adapted [ba]-[wa], whereas noise stimuli adapted [cha]-[sha]. However, such effects occurred even when the adaptors and test continua did not match in other simple acoustic cues (rise time or consonant duration). Spectral overlap of adaptors and test items was also found to be unnecessary for adaptation. The data support the existence of auditory processors sensitive to complex acoustic cues, as well as units that respond to more abstract properties. The latter are probably at a level previously thought to be phonetic. Asymmetrical adaptation was observed, arguing against an opponent-process arrangement of these units. A two-level acoustic model of the speech perception process is offered to account for the data.

13.
In five experiments with synthetic and natural speech syllables, a rating task was used to study the effects of differences in vowels, consonants, and segment order on judged syllable similarity. The results of Experiments I-IV support neither a purely phonemic model of speech representation, in which vowel, consonant, and order are represented independently, nor a purely syllabic model, in which the three factors are integrated. Instead, the data indicate that subjects compare representations in which adjacent vowel and consonant are independent of one another but are not independent of their positions in the syllable. Experiment V provided no support for the hypothesis that this position-sensitive coding is due to acoustic differences in formant transitions.

14.
Event-related potentials (ERPs) were utilized to study brain activity while subjects listened to speech and nonspeech stimuli. The effect of duplex perception was exploited, in which listeners perceive formant transitions that are isolated as nonspeech "chirps," but perceive formant transitions that are embedded in synthetic syllables as unique linguistic events with no chirp-like sounds heard at all (Mattingly et al., 1971). Brain ERPs were recorded while subjects listened to and silently identified plain speech-only tokens, duplex tokens, and tone glides (perceived as "chirps" by listeners). A highly controlled set of stimuli was developed that represented equivalent speech and nonspeech stimulus tokens such that the differences were limited to a single acoustic parameter: amplitude. The acoustic elements were matched in terms of number and frequency of components. Results indicated that the neural activity in response to the stimuli was different for different stimulus types. Duplex tokens had significantly longer latencies than the pure speech tokens. The data are consistent with the contention of separate modules for phonetic and auditory stimuli.

15.
These studies examined the perceptual role of various components of naturally produced stop consonants (/b, d, g, p, t, k/) in CV syllables. In the first experiment, the context-sensitive voiced formant transitions were removed with a computer-splicing technique. Identification accuracy was 84% when the consonant was presented with the same vowel as had been used to produce it. Performance fell to 66% when the consonant was juxtaposed with a different vowel. The second experiment not only deleted the voiced formant transition, but also replaced the aspiration with silence. Here, identification accuracy dropped substantially, especially for voiceless stops, which had contained devoiced formant transitions in the replaced interval. The pattern of errors suggested that listeners try to extract the missing locus of the consonant from the vowel transition, and in the absence of a vowel transition, they try to extrapolate it from the second formant of the steady-state vowel.

16.
Across languages, children with developmental dyslexia have a specific difficulty with the neural representation of the sound structure (phonological structure) of speech. One likely cause of their difficulties with phonology is a perceptual difficulty in auditory temporal processing (Tallal, 1980). Tallal (1980) proposed that basic auditory processing of brief, rapidly successive acoustic changes is compromised in dyslexia, thereby affecting phonetic discrimination (e.g. discriminating /b/ from /d/) via impaired discrimination of formant transitions (rapid acoustic changes in frequency and intensity). However, an alternative auditory temporal hypothesis is that the basic auditory processing of the slower amplitude modulation cues in speech is compromised (Goswami et al., 2002). Here, we contrast children's perception of a synthetic speech contrast (ba/wa) when it is based on the speed of the rate of change of frequency information (formant transition duration) versus the speed of the rate of change of amplitude modulation (rise time). We show that children with dyslexia have excellent phonetic discrimination based on formant transition duration, but poor phonetic discrimination based on envelope cues. The results explain why phonetic discrimination may be allophonic in developmental dyslexia (Serniclaes et al., 2004), and suggest new avenues for the remediation of developmental dyslexia.

17.
We investigated the conditions under which the [b]-[w] contrast is processed in a context-dependent manner, specifically in relation to syllable duration. In an earlier paper, Miller and Liberman (1979) demonstrated that when listeners use transition duration to differentiate [b] from [w], they treat it in relation to the duration of the syllable: As syllables from a [ba]-[wa] series varying in transition duration become longer, so, too, does the transition duration at the [b]-[w] perceptual boundary. In a subsequent paper, Shinn, Blumstein, and Jongman (1985) questioned the generality of this finding by showing that the effect of syllable duration is eliminated for [ba]-[wa] stimuli that are less schematic than those used by Miller and Liberman. In the present investigation, we demonstrated that when these "more natural" stimuli are presented in a multitalker babble noise instead of in quiet (as was done by Shinn et al.), the syllable-duration effect emerges. Our findings suggest that the syllable-duration effect in particular, and context effects in general, may play a more important role in speech perception than Shinn et al. suggested.

18.
In order to function effectively as a means of communication, speech must be intelligible under the noisy conditions encountered in everyday life. Two types of perceptual synthesis have been reported that can reduce or cancel the effects of masking by extraneous sounds: Phonemic restoration can enhance intelligibility when segments are replaced or masked by noise, and contralateral induction can prevent mislateralization by effectively restoring speech masked at one ear when it is heard in the other. The present study reports a third type of perceptual synthesis induced by noise: enhancement of intelligibility produced by adding noise to spectral gaps. In most of the experiments, the speech stimuli consisted of two widely separated narrow bands of speech (center frequencies of 370 and 6000 Hz, each band having high-pass and low-pass slopes of 115 dB/octave meeting at the center frequency). These very narrow bands effectively reduced the available information to frequency-limited patterns of amplitude fluctuation lacking information concerning formant structure and frequency transitions. When stochastic noise was introduced into the gap separating the two speech bands, intelligibility increased for "everyday" sentences, for sentences that varied in the transitional probability of keywords, and for monosyllabic word lists. Effects produced by systematically varying noise amplitude and noise bandwidth are reported, and the implications of some of the novel effects observed are discussed.

19.
Acoustic cues for the perception of place of articulation in aphasia
Two experiments assessed the abilities of aphasic patients and nonaphasic controls to perceive place of articulation in stop consonants. Experiment I explored labeling and discrimination of [ba, da, ga] continua varying in formant transitions with or without an appropriate burst onset appended to the transitions. Results showed general difficulty in perceiving place of articulation for the aphasic patients. Regardless of diagnostic category or auditory language comprehension score, discrimination ability was independent of labeling ability, and discrimination functions were similar to normals even in the context of failure to reliably label the stimuli. Further, there was less variability in performance for stimuli with bursts than without bursts. Experiment II measured the effects of lengthening the formant transitions on perception of place of articulation in stop consonants and on the perception of auditory analogs to the speech stimuli. Lengthening the transitions failed to improve performance for either the speech or nonspeech stimuli, and in some cases, reduced performance level. No correlation was observed between the patients' ability to perceive the speech and nonspeech stimuli.

20.
Morton, Marcus, and Frankish (1976) defined "perceptual center," or "P-center," as a neutral term to describe that which is regular in a perceptually regular sequence of speech sounds. This paper describes a paradigm for the determination of P-center location and the effect of various acoustic parameters on empirically determined P-center locations. It is shown that P-center location is affected by both initial consonant duration and, secondarily, subsequent vowel and consonant duration. A simple two-parameter model involving the duration of the whole stimulus is developed and gives good performance in predicting P-center location. The application of this model to continuous speech is demonstrated. It is suggested that there is little value in attempting to determine any single acoustic or articulatory correlate of P-center location, or in attempting to define P-center location absolutely in time. Rather, these results indicate that P-centers are a property of the whole stimulus and reflect properties of both the production and perception of speech.
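A two-parameter model of this kind can be read as predicting P-center location from a weighted combination of the initial-consonant duration and the duration of the rest of the stimulus. The sketch below assumes that reading; the weights are hypothetical illustrations, not the paper's fitted values.

```python
def p_center(onset_duration_ms, total_duration_ms, alpha=0.65, beta=0.25):
    """Predict P-center location (ms after acoustic onset) as a weighted
    sum of initial-consonant duration and the remaining stimulus duration.
    alpha and beta are hypothetical weights, not fitted parameters."""
    rest = total_duration_ms - onset_duration_ms
    return alpha * onset_duration_ms + beta * rest

# Lengthening the initial consonant shifts the predicted P-center later,
# while the whole-stimulus duration also contributes, as the model requires.
early = p_center(onset_duration_ms=30, total_duration_ms=400)
late = p_center(onset_duration_ms=120, total_duration_ms=400)
```

The point of the two-parameter form is that the prediction depends on the whole stimulus, not on any single acoustic landmark, which is the paper's central claim.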


Copyright©北京勤云科技发展有限公司  京ICP备09084417号