Similar documents
 20 similar documents found (search time: 15 ms)
1.
Using a context-effect paradigm with native Mandarin Chinese listeners, three experiments examined the time course of the activation of acoustic and phonological information in stop consonants. In Experiment 1, the context stimuli were the syllables /ta/ and /ka/ and nonspeech acoustic analogues of the /ta/ and /ka/ stop segments, and the target stimuli were a /ta/-/ka/ contrast continuum; activation of the stops' acoustic information produced no context effect. In Experiment 2, the context stimuli were the syllables /ta/ and /ka/ and the /ta/ and /ka/ stop segments themselves; activation of the stops' phonological information produced a significant contrastive context effect. Experiment 3 varied the interval between the stop segment and the target stimulus to systematically chart the time course of stop-category access, and found that the transition from auditory to phonological processing in stop perception occurs roughly 120 ms after stimulus processing begins. Together, the results reveal the time course of phoneme-category access in the perception of stop consonants.

2.
Despite many attempts to define the major unit of speech perception, none has been generally accepted. In a unique study, Mermelstein (1978) claimed that consonants and vowels are the appropriate units because a single piece of information (duration, in this case) can be used for one distinction without affecting the other. In a replication, this apparent independence was found, instead, to reflect a lack of statistical power: The vowel and consonant judgments did interact. In another experiment, interdependence of two phonetic judgments was found in responses based on the fricative noise and the vocalic formants of a fricative-vowel syllable. These results show that each judgment made on speech signals must take into account other judgments that compete for information in the same signal. An account is proposed that takes segments as the primary units, with syllables imposing constraints on the shape they may take.

3.
It is well known that the formant transitions of stop consonants in CV and VC syllables are roughly the mirror image of each other in time. These formant motions reflect the acoustic correlates of the articulators as they move rapidly into and out of the period of stop closure. Although acoustically different, these formant transitions are correlated perceptually with similar phonetic segments. Earlier research of Klatt and Shattuck (1975) had suggested that mirror image acoustic patterns resembling formant transitions were not perceived as similar. However, mirror image patterns could still have some underlying similarity which might facilitate learning, recognition, and the establishment of perceptual constancy of phonetic segments across syllable positions. This paper reports the results of four experiments designed to study the perceptual similarity of mirror-image acoustic patterns resembling the formant transitions and steady-state segments of the CV and VC syllables /ba/, /da/, /ab/, and /ad/. Using a perceptual learning paradigm, we found that subjects could learn to assign mirror-image acoustic patterns to arbitrary response categories more consistently than they could do so with similar arrangements of the same patterns based on spectrotemporal commonalities. Subjects respond not only to the individual components or dimensions of these acoustic patterns, but also process entire patterns and make use of the patterns’ internal organization in learning to categorize them consistently according to different classification rules.

4.
In the McGurk effect, perception of audiovisually discrepant syllables can depend on auditory, visual, or a combination of audiovisual information. Under some conditions, visual information can override auditory information to the extent that identification judgments of a visually influenced syllable can be as consistent as for an analogous audiovisually compatible syllable. This might indicate that visually influenced and analogous audiovisually compatible syllables are phonetically equivalent. Experiments were designed to test this issue using a compelling visually influenced syllable in an AXB matching paradigm. Subjects were asked to match an audio syllable /va/ either to an audiovisually consistent syllable (audio /va/-video /fa/) or an audiovisually discrepant syllable (audio /ba/-video /fa/). It was hypothesized that if the two audiovisual syllables were phonetically equivalent, then subjects should choose them equally often in the matching task. Results show, however, that subjects are more likely to match the audio /va/ to the audiovisually consistent /va/, suggesting differences in phonetic convincingness. Additional experiments further suggest that this preference is not based on a phonetically extraneous dimension or on noticeable relative audiovisual discrepancies.

5.
In the McGurk effect, perception of audiovisually discrepant syllables can depend on auditory, visual, or a combination of audiovisual information. Under some conditions, visual information can override auditory information to the extent that identification judgments of a visually influenced syllable can be as consistent as for an analogous audiovisually compatible syllable. This might indicate that visually influenced and analogous audiovisually compatible syllables are phonetically equivalent. Experiments were designed to test this issue using a compelling visually influenced syllable in an AXB matching paradigm. Subjects were asked to match an audio syllable /va/ either to an audiovisually consistent syllable (audio /va/-video /fa/) or an audiovisually discrepant syllable (audio /ba/-video /fa/). It was hypothesized that if the two audiovisual syllables were phonetically equivalent, then subjects should choose them equally often in the matching task. Results show, however, that subjects are more likely to match the audio /va/ to the audiovisually consistent /va/, suggesting differences in phonetic convincingness. Additional experiments further suggest that this preference is not based on a phonetically extraneous dimension or on noticeable relative audiovisual discrepancies.

6.
It has been demonstrated using the "silent-center" (SC) syllable paradigm that there is sufficient information in syllable onsets and offsets, taken together, to support accurate identification of vowels spoken in both citation-form syllables and syllables spoken in sentence context. Using edited natural speech stimuli, the present study examined the identification of American English vowels when increasing amounts of syllable onsets alone or syllable offsets alone were presented in their original sentence context. The stimuli were /d/-vowel-/d/ syllables spoken in a short carrier sentence by a male speaker. Listeners attempted to identify the vowels in experimental conditions that differed in the number of pitch periods presented and whether the pitch periods were from syllable onsets or syllable offsets. In general, syllable onsets were more informative than syllable offsets, although neither onsets nor offsets alone specified vowel identity as well as onsets and offsets together (SC syllables). Vowels differed widely in ease of identification; the diphthongized long vowels /e/, /ae/, /o/ were especially difficult to identify from syllable offsets. Identification of vowels as "front" or "back" was accurate, even from short samples of the syllable; however, vowel "height" was quite difficult to determine, again, especially from syllable offsets. The results emphasize the perceptual importance of time-varying acoustic parameters, which are the direct consequence of the articulatory dynamics involved in producing syllables.

7.
Subjects monitored for the syllable-initial phonemes /b/ and /s/, as well as for the syllables containing those phonemes, in lists of nonsense syllables. Time to detect /b/ was a function of the amount of uncertainty as to the identity of the vowel following the target consonant; when uncertainty was low, no difference existed between phoneme and syllable monitoring latencies, but when uncertainty was high, syllables were detected faster than phonemes. Time to detect /s/ was independent of uncertainty concerning the accompanying vowel and was always slower than syllable detection. The role of knowledge of contexts in a phoneme-monitoring task as well as the relative availability of phonemic information to the listener in this task are discussed.

8.
It has been demonstrated using the “silent-center” (SC) syllable paradigm that there is sufficient information in syllable onsets and offsets, taken together, to support accurate identification of vowels spoken in both citation-form syllables and syllables spoken in sentence context. Using edited natural speech stimuli, the present study examined the identification of American English vowels when increasing amounts of syllable onsets alone or syllable offsets alone were presented in their original sentence context. The stimuli were /d/-vowel-/d/ syllables spoken in a short carrier sentence by a male speaker. Listeners attempted to identify the vowels in experimental conditions that differed in the number of pitch periods presented and whether the pitch periods were from syllable onsets or syllable offsets. In general, syllable onsets were more informative than syllable offsets, although neither onsets nor offsets alone specified vowel identity as well as onsets and offsets together (SC syllables). Vowels differed widely in ease of identification; the diphthongized long vowels /e/, /ae/, /o/ were especially difficult to identify from syllable offsets. Identification of vowels as “front” or “back” was accurate, even from short samples of the syllable; however, vowel “height” was quite difficult to determine, again, especially from syllable offsets. The results emphasize the perceptual importance of time-varying acoustic parameters, which are the direct consequence of the articulatory dynamics involved in producing syllables.

9.
This study investigated the acoustic correlates of perceptual centers (p-centers) in CV and VC syllables and developed an acoustic p-center model. In Part 1, listeners located syllables’ p-centers by a method-of-adjustment procedure. The CV syllables contained the consonants /?/, /r/, /n/, /t/, /d/, /k/, and /g/; the VCs, the consonants /?/, /r/, and /n/. The vowel in all syllables was /a/. The results of this experiment replicated and extended previous findings regarding the effects of phonetic variation on p-centers. In Part 2, a digital signal processing procedure was used to acoustically model p-center perception. Each stimulus was passed through a six-band digital filter, and the outputs were processed to derive low-frequency modulation components. These components were weighted according to a perceived modulation magnitude function and recombined to create six psychoacoustic envelopes containing modulation energies from 3 to 47 Hz. In this analysis, p-centers were found to be highly correlated with the time-weighted function of the rate-of-change in the psychoacoustic envelopes, multiplied by the psychoacoustic envelope magnitude increment. The results were interpreted as suggesting (1) the probable role of low-frequency energy modulations in p-center perception, and (2) the presence of perceptual processes that integrate multiple articulatory events into a single syllabic event.
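A loose sketch of the modeling pipeline described above (band filtering, rectification, 3-47 Hz modulation components, time-weighted rate-of-change scaled by the envelope increment) might look like the following. The band edges, filter implementation (an ideal FFT-mask filter rather than the paper's six-band digital filter), and test signal are illustrative assumptions, not the paper's parameters.

```python
import numpy as np

def bandpass(x, fs, lo, hi):
    """Ideal (FFT-mask) band-pass filter; a crude stand-in for a digital
    filter bank."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, 1.0 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spec, x.size)

def psychoacoustic_envelopes(x, fs, bands, mod_band=(3.0, 47.0)):
    """Band-filter, rectify, and keep only the 3-47 Hz modulation
    components of each band envelope."""
    return np.array([bandpass(np.abs(bandpass(x, fs, lo, hi)), fs, *mod_band)
                     for lo, hi in bands])

def p_center(envs, fs):
    """Time-weighted positive rate-of-change, scaled by the envelope
    magnitude increment and summed over bands."""
    rate = np.clip(np.diff(envs, axis=1), 0.0, None)      # rising portions
    weight = (rate * np.clip(envs[:, 1:], 0.0, None)).sum(axis=0)
    t = np.arange(weight.size) / fs
    return float((t * weight).sum() / weight.sum())

# A tone burst from 0.2 s to 0.45 s: the estimated p-center should land
# near the burst onset, where the envelopes rise fastest.
fs = 8000
t = np.arange(int(0.8 * fs)) / fs
x = np.where((t >= 0.2) & (t < 0.45), np.sin(2 * np.pi * 500.0 * t), 0.0)
pc = p_center(psychoacoustic_envelopes(x, fs, [(300, 800), (800, 2000)]), fs)
```

This is only a sketch of the general idea; the paper's perceived-modulation weighting function is omitted here.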

10.
In three experiments, we determined how perception of the syllable-initial distinction between the stop consonant [b] and the semivowel [w], when cued by duration of formant transitions, is affected by parts of the sound pattern that occur later in time. For the first experiment, we constructed four series of syllables, similar in that each had initial formant transitions ranging from one short enough for [ba] to one long enough for [wa], but different in overall syllable duration. The consequence in perception was that, as syllable duration increased, the [b-w] boundary moved toward transitions of longer duration. Then, in the second experiment, we increased the duration of the sound by adding a second syllable, [da], (thus creating [bada-wada]), and observed that lengthening the second syllable also shifted the perceived [b-w] boundary in the first syllable toward transitions of longer duration; however, this effect was small by comparison with that produced when the first syllable was lengthened equivalently. In the third experiment, we found that altering the structure of the syllable had an effect that is not to be accounted for by the concomitant change in syllable duration: lengthening the syllable by adding syllable-final transitions appropriate for the stop consonant [d] (thus creating [bad-wad]) caused the perceived [b-w] boundary to shift toward transitions of shorter duration, an effect precisely opposite to that produced when the syllable was lengthened to the same extent by adding steady-state vowel. We suggest that, in all these cases, the later-occurring information specifies rate of articulation and that the effect on the earlier-occurring cue reflects an appropriate perceptual normalization.

11.
Discrimination of speech sounds from three computer-generated continua that ranged from voiced to voiceless syllables (/ba-pa/, /da-ta/, and /ga-ka/) was tested with three macaques. The stimuli on each continuum varied in voice-onset time (VOT). Pairs of stimuli that were equally different in VOT were chosen such that they were either within-category pairs (syllables given the same phonetic label by human listeners) or between-category pairs (syllables given different phonetic labels by human listeners). Results demonstrated that discrimination performance was always best for between-category pairs of stimuli, thus replicating the “phoneme boundary effect” seen in adult listeners and in human infants as young as 1 month of age. The findings are discussed in terms of their specific impact on accounts of voicing perception in human listeners and in terms of their impact on discussions of the evolution of language.

12.
Speech perception can be viewed in terms of the listener’s integration of two sources of information: the acoustic features transduced by the auditory receptor system and the context of the linguistic message. The present research asked how these sources were evaluated and integrated in the identification of synthetic speech. A speech continuum between the glide-vowel syllables /ri/ and /li/ was generated by varying the onset frequency of the third formant. Each sound along the continuum was placed in a consonant-cluster vowel syllable after an initial consonant /p/, /t/, /s/, and /v/. In English, both /r/ and /l/ are phonologically admissible following /p/ but are not admissible following /v/. Only /l/ is admissible following /s/ and only /r/ is admissible following /t/. A third experiment used synthetic consonant-cluster vowel syllables in which the first consonant varied between /b/ and /d/ and the second consonant varied between /l/ and /r/. Identification of synthetic speech varying in both acoustic featural information and phonological context allowed quantitative tests of various models of how these two sources of information are evaluated and integrated in speech perception.
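One standard quantitative candidate for this kind of integration (the abstract does not name its models, so this is an assumption) is a fuzzy-logical combination: each source contributes an independent degree of support for /r/, and the two are multiplied and renormalized.

```python
def flmp(acoustic: float, context: float) -> float:
    """Fuzzy-logical integration: `acoustic` and `context` are truth
    values in [0, 1] supporting /r/; support for /l/ is the complement.
    The two sources combine multiplicatively and are renormalized."""
    support_r = acoustic * context
    support_l = (1.0 - acoustic) * (1.0 - context)
    return support_r / (support_r + support_l)

# Ambiguous acoustics (0.5) defer entirely to context: after /t/, where
# only /r/ is admissible, strong context support pushes responses to /r/.
print(flmp(0.5, 0.95))  # ≈ 0.95
```

The key property this captures is that an ambiguous acoustic cue is resolved by phonological context, while an unambiguous cue dominates regardless of context.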

13.
In earlier work we have shown that adults, young children, and infants are capable of computing transitional probabilities among adjacent syllables in rapidly presented streams of speech, and of using these statistics to group adjacent syllables into word-like units. In the present experiments we ask whether adult learners are also capable of such computations when the only available patterns occur in non-adjacent elements. In the first experiment, we present streams of speech in which precisely the same kinds of syllable regularities occur as in our previous studies, except that the patterned relations among syllables occur between non-adjacent syllables (with an intervening syllable that is unrelated). Under these circumstances we do not obtain our previous results: learners are quite poor at acquiring regular relations among non-adjacent syllables, even when the patterns are objectively quite simple. In subsequent experiments we show that learners are, in contrast, quite capable of acquiring patterned relations among non-adjacent segments: both non-adjacent consonants (with an intervening vocalic segment that is unrelated) and non-adjacent vowels (with an intervening consonantal segment that is unrelated). Finally, we discuss why human learners display these strong differences in learning differing types of non-adjacent regularities, and we conclude by suggesting that these contrasts in learnability may account for why human languages display non-adjacent regularities of one type much more widely than non-adjacent regularities of the other type.
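The adjacent vs. non-adjacent computation at issue can be sketched as follows; the toy syllable stream and the `gap` parameter are illustrative assumptions, not the authors' stimuli.

```python
from collections import Counter

def transitional_probabilities(stream, gap=0):
    """P(second | first) for element pairs separated by `gap` intervening
    elements: gap=0 gives adjacent transitional probabilities, gap=1 the
    non-adjacent case with one unrelated element in between."""
    pair_counts, first_counts = Counter(), Counter()
    for a, b in zip(stream, stream[1 + gap:]):
        pair_counts[(a, b)] += 1
        first_counts[a] += 1
    return {p: n / first_counts[p[0]] for p, n in pair_counts.items()}

# Toy stream built from three "words" (pa-bi, ku-ti, do-ga) in varying
# order: within-word transitions are perfectly predictive, while
# transitions across word boundaries are not.
stream = ["pa", "bi", "ku", "ti", "do", "ga",
          "pa", "bi", "do", "ga", "ku", "ti", "pa", "bi"]
probs = transitional_probabilities(stream)
print(probs[("pa", "bi")])  # 1.0 (within-word)
print(probs[("bi", "ku")])  # 0.5 (across a word boundary)
```

A learner segmenting at dips in these probabilities recovers the word boundaries; the non-adjacent case is the same computation with `gap=1`.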

14.
By employing new methods of analysis to the physical signal, a number of researchers have provided evidence which suggests that there may be invariant acoustic cues which serve to identify the presence of particular phonetic segments (e.g., Kewley-Port, 1980; Searle, Jacobson, & Rayment, 1979; Stevens & Blumstein, 1978). Whereas previous studies have focused upon the existence of invariant properties present in the physical stimulus, the present study examines the existence of any invariant information available in the psychological stimulus. For this purpose, subjects were asked to classify either a series of full-CV syllables ([bi], [bε], [bo], [??], [di], [dε], [do], [??]) or one of two series of chirp stimuli consisting of information available in the first 30 msec of each syllable. The full-formant chirp stimuli consisted of the first 30 msec of each syllable, whereas the two-formant chirps were composed of the first 30 msec of only the second and third formants. The object of the present study was to determine whether or not there was sufficient information available in either the full- or two-formant chirp series to allow subjects to group the stimuli into two classes corresponding to the identity of the initial consonant of the syllables (i.e., [b] or [d]). A series of classification tasks was used, ranging from a completely free sorting task to a perceptual learning task with experimenter-imposed classifications. The results suggest that there is information available in the full-formant chirps, but not in the two-formant chirps, which allows subjects to group the sounds into classes corresponding to the identity of the initial consonant sounds.

15.
Although many individual speech contrast pairs have been studied within the cross-language literature, no one has created a comprehensive and systematic set of such stimuli. This article justifies and details an extensive set of contrast pairs for Mandarin Chinese and American English. The stimuli consist of 180 pairs of CVC syllables recorded in two tokens each (720 syllables total). Between each CVC pair, two of the segments are identical, whereas the third differs in that a segment drawn from a "native" phonetic category (either Mandarin, English, or both) is partnered with a segment drawn from a "foreign" phonetic category (nonnative to Mandarin, English, or both). Each contrast pair differs by a minimal phonetic amount and constitutes a meaningful contrast among the world's languages (as cataloged in the UCLA Phonological Segment Inventory Database of 451 languages). The entire collection of phonetic differences envelops Mandarin and English phonetic spaces and generates a range of phonetic discriminability. Contrastive segments are balanced through all possible syllable positions, with noncontrastive segments being filled in with other "foreign" segments. Although intended to measure phonetic perceptual sensitivity among adult speakers of the two languages, these stimuli are offered here to all for similar or for altogether unrelated investigations.

16.
An experiment was conducted which assessed the relative contributions of three acoustic cues to the distinction between stop consonant and semivowel in syllable initial position. Subjects identified three series of syllables which varied perceptually from [ba] to [wa]. The stimuli differed only in the extent, duration, and rate of the second formant transition. In each series, one of the variables remained constant while the other two changed. Obtained identification ratings were plotted as a function of each variable. The results indicated that second formant transition duration and extent contribute significantly to perception. Short second formant transition extents and durations signal stops, while long second formant transition extents and durations signal semivowels. It was found that second formant transition rate did not contribute significantly to this distinction. Any particular rate could signal either a stop or semivowel. These results are interpreted as arguing against models that incorporate transition rate as a cue to phonetic distinctions. In addition, these results are related to a previous selective adaptation experiment. It is shown that the “phonetic” interpretation of the obtained adaptation results was not justified.

17.
The present study was undertaken to investigate the effects of syllabic stress and segment structure on selective adaptation in speech. To this end, a CV place of articulation test continuum was selectively adapted by seven different adapting stimuli: the monosyllables [ba] and [ga], two disyllabic stimuli containing equal stress on both syllables, [baga] and [gaba], and three disyllabic stimuli ([baga]) in which stress placement varied and was cued by the acoustic parameters of fundamental frequency and duration. Results for the two monosyllabic adapting stimuli demonstrated significant [b] adaptation for the stimulus [ba] and significant [g] adaptation for [ga]. Of the five other adapting stimuli, only [g] adaptation for the stimulus [bagá] was found to be significant. These findings indicate that the operation of detector mechanisms susceptible to fatigue by an adapting stimulus is even more constrained than has heretofore been suggested. It appears that the adapting and test stimuli must not only have the same phonetic and syllable structure, but also the same syllabic organization.

18.
Ebbinghaus noted that there were great differences in nonsense syllables in ease of learning. Later investigators attempted to control for this variability but made little effort to account for the differences. An approach from the point of view of systematic linguistics suggests that a major source of variability can be found in the phonetic and orthographic "distance from English." A scale of phonetic distance and a scale of orthographic distance combined in multiple regression to predict association value and meaningfulness with R above +.80. Experimental tests suggest that subjects are highly sensitive to violations of the rules of syllable structure even when the syllables are very unlike English. It is suggested that nonsense syllables do equate for prior knowledge across subjects because all subjects are highly familiar with the phonetic and orthographic rules that contribute heavily to the meaningfulness of nonsense syllables.
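A two-predictor regression of the kind described can be sketched with ordinary least squares; the distance and meaningfulness values below are fabricated toy numbers for illustration only, not the study's data.

```python
import numpy as np

# Hypothetical ratings for six nonsense syllables: greater phonetic and
# orthographic "distance from English" should predict lower meaningfulness.
phonetic_dist     = np.array([0.1, 0.4, 0.5, 0.8, 0.9, 0.2])
orthographic_dist = np.array([0.2, 0.3, 0.6, 0.7, 0.9, 0.1])
meaningfulness    = np.array([0.90, 0.70, 0.50, 0.30, 0.10, 0.85])

# Ordinary least squares with an intercept column.
X = np.column_stack([np.ones_like(phonetic_dist),
                     phonetic_dist, orthographic_dist])
beta, *_ = np.linalg.lstsq(X, meaningfulness, rcond=None)

# Multiple R: correlation between fitted and observed values.
multiple_r = np.corrcoef(X @ beta, meaningfulness)[0, 1]
print(round(multiple_r, 2))  # comfortably above .80 for these toy data
```

With real rating scales the same fit yields the multiple R the abstract reports (above +.80).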

19.
Cholin, J., Levelt, W. J. M., & Schiller, N. O. (2006). Cognition, 99(2), 205-235.
In the speech production model proposed by [Levelt, W. J. M., Roelofs, A., & Meyer, A. S. (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences, 22, 1-75], syllables play a crucial role at the interface of phonological and phonetic encoding. At this interface, abstract phonological syllables are translated into phonetic syllables. It is assumed that this translation process is mediated by a so-called Mental Syllabary. Rather than constructing the motor programs for each syllable on-line, the mental syllabary is hypothesized to provide pre-compiled gestural scores for the articulators. In order to find evidence for such a repository, we investigated syllable-frequency effects: If the mental syllabary consists of retrievable representations corresponding to syllables, then the retrieval process should be sensitive to frequency differences. In a series of experiments using a symbol-position association learning task, we tested whether high-frequency syllables are retrieved and produced faster than low-frequency syllables. We found significant syllable frequency effects with monosyllabic pseudo-words and with disyllabic pseudo-words in which the first syllable bore the frequency manipulation; no effect was found when the frequency manipulation was on the second syllable. The implications of these results for the theory of word form encoding at the interface of phonological and phonetic encoding, especially with respect to the access mechanisms to the mental syllabary in the speech production model of Levelt et al., are discussed.

20.
To better understand how infants process complex auditory input, this study investigated whether 11-month-old infants perceive the pitch (melodic) or the phonetic (lyric) components within songs as more salient, and whether melody facilitates phonetic recognition. Using a preferential looking paradigm, uni-dimensional and multi-dimensional songs were tested; either the pitch or syllable order of the stimuli varied. As a group, infants detected a change in pitch order in a 4-note sequence when the syllables were redundant (experiment 1), but did not detect the identical pitch change with variegated syllables (experiment 2). Infants were better able to detect a change in syllable order in a sung sequence (experiment 2) than the identical syllable change in a spoken sequence (experiment 1). These results suggest that by 11 months, infants cannot “ignore” phonetic information in the context of perceptually salient pitch variation. Moreover, the increased phonetic recognition in song contexts mirrors findings that demonstrate advantages of infant-directed speech. Findings are discussed in terms of how stimulus complexity interacts with the perception of sung speech in infancy.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号