Similar Documents
20 similar documents retrieved (search time: 31 ms)
1.
This three-part study demonstrates that perceptual order can influence the integration of acoustic speech cues. In Experiment 1, subjects labeled the [s] and [sh] in natural fricative-vowel (FV) and vowel-fricative (VF) syllables in which the frication was replaced with synthetic stimuli. Responses to these "hybrid" stimuli were influenced by cues in the vocalic segment as well as by the synthetic frication. However, the influence of the preceding vocalic cues was considerably weaker than that of the following vocalic cues. Experiment 2 examined the acoustic bases for this asymmetry; its analyses revealed that FV and VF syllables are similar in terms of the acoustic structures thought to underlie the vocalic context effects. Experiment 3 examined the perceptual bases for the asymmetry. A subset of the hybrid FV and VF stimuli were presented in reverse, such that the acoustic and perceptual bases for the asymmetry were pitted against each other in the listening task. The perceptual bases (i.e., the perceived order of the frication and vocalic cues) proved to be the determining factor. Current auditory processing models, such as backward recognition masking, preperceptual auditory storage, or models based on linguistic factors, do not adequately account for the observed asymmetries.

2.
We examined whether children modify their perceptual weighting strategies for speech on the basis of the order of segments within a syllable, as adults do. To this end, fricative-vowel (FV) and vowel-fricative (VF) syllables were constructed with synthetic noises from an /ʃ/-to-/s/ continuum combined with natural /a/ and /u/ portions with transitions appropriate for a preceding or a following /ʃ/ or /s/. Stimuli were played in their original order to adults and children (7 and 5 years of age) in Experiment 1 and in reversed order in Experiment 2. The results for adults and, to a lesser extent, those for 7-year-olds replicated earlier results showing that adults assign different perceptual weights to acoustic properties depending on segmental order. In contrast, results for 5-year-olds suggested that these listeners applied the same strategies during fricative labeling regardless of segmental order. Thus, the flexibility to modify perceptual weighting strategies for speech according to segmental order apparently emerges with experience.

3.
Using a context-effect paradigm with native Chinese listeners, three experiments examined the time course of activation of the acoustic and phonetic information carried by stop consonants. In Experiment 1, the context stimuli were the syllables /ta/ and /ka/ and nonspeech acoustic analogues of the /ta/ and /ka/ stop segments, and the target stimuli came from a /ta/-/ka/ contrast continuum; activation of the stops' acoustic information produced no context effect. In Experiment 2, the context stimuli were the syllables /ta/ and /ka/ and the /ta/ and /ka/ stop segments themselves; activation of the stops' phonetic information produced a significant contrastive context effect. Experiment 3 varied the interval between the stop segment and the target stimulus to trace the time course of access to the stop category systematically; the shift from auditory to phonetic processing in stop perception occurred at about 120 ms after stimulus processing began. Together, the results reveal the time course of access to phonemic categories in stop consonant perception.

4.
The acoustic cues to the phonetic identity of diphthongs normally include both spectral quality and dynamic change. This fact was exploited in a series of selective adaptation experiments examining the possibility of mutual adaptive effects between these two types of acoustic cues. One continuum of syllables varying from [εi] to [εd] and another varying from [ε] to [εi] were synthesized; endpoint stimuli of both series used as adaptors caused identification boundaries to be shifted. Cross-series adaptation was also attempted on the [ε]-[εi] stimuli, using [?], [∞], and [ai]. Only [ai] proved effective as an adaptor, suggesting the mediation of a rather abstract auditory level of similarity. The results argue strongly against interpretations in terms of feature detectors, but appear compatible with an “auditory contrast” explanation, which might in turn be incorporated within adaptation level theory in the form recently discussed by Restle (1978). The cross-series results further suggest that selective adaptation might be used to quantify the perceptual distance between auditory cues in speech.

5.
When the auditory and visual components of spoken audiovisual nonsense syllables are mismatched, perceivers produce four different types of perceptual responses: auditory correct, visual correct, fusion (the so-called McGurk effect), and combination (i.e., two consonants are reported). Here, quantitative measures were developed to account for the distribution of the four types of perceptual responses to 384 different stimuli from four talkers. The measures included mutual information, correlations, and acoustic measures, all representing audiovisual stimulus relationships. In Experiment 1, open-set perceptual responses were obtained for acoustic /bɑ/ or /lɑ/ dubbed to video /bɑ, dɑ, gɑ, vɑ, zɑ, lɑ, wɑ, ðɑ/. The talker, the video syllable, and the acoustic syllable significantly influenced the type of response. In Experiment 2, the best predictors of response category proportions were a subset of the physical stimulus measures, accounting for between 17% and 52% of the variance in the perceptual response category proportions. That audiovisual stimulus relationships can account for perceptual response distributions supports the possibility that internal representations are based on modality-specific stimulus relationships.
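As a rough illustration of the kind of stimulus-relationship measure mentioned above, the sketch below estimates mutual information between discretized audio and visual feature codes. The feature codes and their binning are hypothetical stand-ins, not the study's actual physical measures.

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Plug-in estimate of MI (in bits) between two paired
    sequences of discrete audio/visual feature codes."""
    n = len(x)
    px, py = Counter(x), Counter(y)
    pxy = Counter(zip(x, y))
    mi = 0.0
    for (a, v), c in pxy.items():
        p_joint = c / n
        mi += p_joint * np.log2(p_joint / ((px[a] / n) * (py[v] / n)))
    return mi

# Hypothetical discretized feature codes for 8 audiovisual tokens.
audio_codes  = [0, 0, 1, 1, 2, 2, 3, 3]
visual_codes = [0, 1, 1, 1, 2, 2, 3, 0]
print(f"MI = {mutual_information(audio_codes, visual_codes):.3f} bits")
```

Counting joint and marginal frequencies is the simplest estimator; continuous acoustic or optical measures would first have to be discretized before such a measure could be related to response proportions.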

6.
The musical quality of timbre is based on both spectral and dynamic acoustic cues. Four 2-part experiments examined whether these properties are represented in the mental image of a musical timbre. Experiment 1 established that imagery occurs for timbre variations within a single musical instrument, using plucked and bowed tones from a cello. Experiments 2 and 3 used synthetic stimuli that varied in either spectral or dynamic properties only, to investigate imagery with strict acoustic control over the stimuli. Experiment 4 explored whether the dimension of loudness is stored in an auditory image. Spectral properties appear to play a much larger role than dynamic properties in imagery for musical timbre.

7.
8.
Impairment of auditory perception and language comprehension in dysphasia
Men with chronic focal brain wounds were examined for their ability to discriminate complex tones, synthesized steady-state vowels, and synthesized consonant-vowel syllables. Subjects with left hemisphere damage, but not right hemisphere damage, were impaired in their ability to respond correctly to rapidly changing acoustic stimuli, regardless of whether stimuli were verbal or nonverbal. The degree of impairment in auditory processing correlated highly with the degree of language comprehension impairment. The pattern of impairment of the group with left hemisphere damage on these perceptual tests was similar to that found in children with developmental language disorders.

9.
For native speakers of English and several other languages, preceding vocalic duration and F1 offset frequency are two of the cues that convey the stop consonant voicing distinction in word-final position. For speakers learning English as a second language, there are indications that use of vocalic duration, but not F1 offset frequency, may be hindered by a lack of experience with phonemic (i.e., lexical) vowel length (the “phonemic vowel length account”: Crowther & Mann, 1992). In this study, native speakers of Arabic, a language that includes a phonemic vowel length distinction, were tested for their use of vocalic duration and F1 offset in production and perception of the English consonant-vowel-consonant forms pod and pot. The phonemic vowel length hypothesis predicts that Arabic speakers should use vocalic duration extensively in production and perception. On the contrary, Experiment 1 revealed that, consistent with Flege and Port’s (1981) findings, they produced only slightly (but significantly) longer vocalic segments in their pod tokens. It further indicated that their productions showed a significant variation in F1 offset as a function of final stop voicing. Perceptual sensitivity to vocalic duration and F1 offset as voicing cues was tested in two experiments. In Experiment 2, we employed a factorial combination of these two cues and a finely spaced vocalic duration continuum. Arabic speakers did not appear to be very sensitive to vocalic duration, but they were about as sensitive as native English speakers to F1 offset frequency. In Experiment 3, we employed a one-dimensional continuum of more widely spaced stimuli that varied only vocalic duration. Arabic speakers showed native-English-like sensitivity to vocalic duration. An explanation based on the perceptual anchor theory of context coding (Braida et al., 1984; Macmillan, 1987; Macmillan, Braida, & Goldberg, 1987) and phoneme perception theory (Schouten & Van Hessen, 1992) is offered to reconcile the apparently contradictory perceptual findings. The explanation does not attribute native-English-like voicing perception to the Arabic subjects. The findings in this study call for a modification of the phonemic vowel length hypothesis.

10.
Several experiments investigate voicing judgments in minimal pairs like rabid-rapid when the duration of the first vowel and the medial stop are varied factorially and other cues for voicing remain ambiguous. In Experiments 1 and 2, in which synthetic labial and velar-stop voicing pairs are investigated, the perceptual boundary along a continuum of silent consonant durations varies in constant proportion to increases in the duration of the preceding vocalic interval. In Experiment 3, it is shown that speaking tempo external to the test word has far smaller effects on a closure duration boundary for voicing than does the tempo within the test word. Experiment 4 shows that, even within the word, it is primarily the preceding vowel that accounts for changes in the consonant duration effects. Furthermore, in Experiments 3 and 4, the effects of timing outside the vowel-consonant interval are independent of the duration of that interval itself. These findings suggest that consonant/vowel ratio serves as a primary acoustic cue for English voicing in syllable-final position and imply that this ratio possibly is directly extracted from the speech signal.
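To make the proportionality finding concrete, here is a minimal numeric sketch of the consonant/vowel ratio idea: if the voicing boundary along the closure-duration continuum moves in constant proportion to the preceding vowel's duration, a single criterion ratio classifies tokens across speaking rates. The criterion value and the durations are invented for illustration.

```python
# Hypothetical criterion: closure/vowel duration ratios above this
# value are heard as voiceless ("rapid"), below it as voiced ("rabid").
CRITERION_RATIO = 0.4  # invented value, not from the paper

def voicing_percept(closure_ms: float, vowel_ms: float) -> str:
    ratio = closure_ms / vowel_ms
    return "voiceless /p/" if ratio > CRITERION_RATIO else "voiced /b/"

# The same 80-ms closure flips category as the preceding vowel lengthens:
for vowel_ms in (150, 250):
    print(f"vowel {vowel_ms} ms:", voicing_percept(80, vowel_ms))
```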

11.
Trading relations show that diverse acoustic consequences of minimal contrasts in speech are equivalent in perception of phonetic categories. This perceptual equivalence received stronger support from a recent finding that discrimination was differentially affected by the phonetic cooperation or conflict between two cues for the /slIt/-/splIt/ contrast. Experiment 1 extended the trading relations and perceptual equivalence findings to the /sei/-/stei/ contrast. With a more sensitive discrimination test, Experiment 2 found that cue equivalence is a characteristic of perceptual sensitivity to phonetic information. Using “sine-wave analogues” of the /sei/-/stei/ stimuli, Experiment 3 showed that perceptual integration of the cues was phonetic, not psychoacoustic, in origin. Only subjects who perceived the sine-wave stimuli as “say” and “stay” showed a trading relation and perceptual equivalence; subjects who perceived them as nonspeech failed to integrate the two dimensions perceptually. Moreover, the pattern of differences between obtained and predicted discrimination was quite similar across the first two experiments and the “say”-“stay” group of Experiment 3, and suggested that phonetic perception was responsible even for better-than-predicted performance by these groups. Trading relations between speech cues, and the perceptual equivalence that underlies them, thus appear to derive specifically from perception of phonetic information.

12.
The effects of viewing the face of the talker (visual speech) on the processing of clearly presented intact auditory stimuli were investigated using two measures likely to be sensitive to the articulatory motor actions produced in speaking. The aim of these experiments was to highlight the need for accounts of the effects of audio-visual (AV) speech that explicitly consider the properties of articulated action. The first experiment employed a syllable-monitoring task in which participants were required to monitor for target syllables within foreign carrier phrases. An AV effect was found in that seeing a talker's moving face (moving face condition) assisted in more accurate recognition (hits and correct rejections) of spoken syllables than did auditory-only still face (still face condition) presentations. The second experiment examined processing of spoken phrases by investigating whether an AV effect would be found for estimates of phrase duration. Two effects of seeing the moving face of the talker were found. First, the moving face condition had significantly longer duration estimates than the still face auditory-only condition. Second, estimates of auditory duration made in the moving face condition reliably correlated with the actual durations, whereas those made in the still face auditory condition did not. The third experiment was carried out to determine whether the stronger correlation between estimated and actual duration in the moving face condition might have been due to generic properties of AV presentation. Experiment 3 employed the procedures of the second experiment but used stimuli that were not perceived as speech although they possessed the same timing cues as those of the speech stimuli of Experiment 2. It was found that simply presenting both auditory and visual timing information did not result in more reliable duration estimates. Further, when released from the speech context (used in Experiment 2), duration estimates for the auditory-only stimuli were significantly correlated with actual durations. In all, these results demonstrate that visual speech can assist in the analysis of clearly presented auditory stimuli in tasks concerned with information provided by viewing the production of an utterance. We suggest that these findings are consistent with there being a processing link between perception and action such that viewing a talker speaking will activate speech motor schemas in the perceiver.

13.
Five-year-old children were tested for perceptual trading relations between a temporal cue (silence duration) and a spectral cue (F1 onset frequency) for the “say-stay” distinction. Identification functions were obtained for two synthetic “say-stay” continua, each containing systematic variations in the amount of silence following the /s/ noise. In one continuum, the vocalic portion had a lower F1 onset than in the other continuum. Children showed a smaller trading relation than has been found with adults. They did not differ from adults, however, in their perception of an “ay-day” continuum formed by varying F1 onset frequency only. The results of a discrimination task in which the two acoustic cues were made to “cooperate” or “conflict” phonetically supported the notion of perceptual equivalence of the temporal and spectral cues along a single phonetic dimension. The results indicate that young children, like adults, perceptually integrate multiple cues to a speech contrast in a phonetically relevant manner, but that they may not give the same perceptual weights to the various cues as do adults.
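As a toy illustration of cue weighting and trading (not the authors' model), the sketch below treats the probability of a "stay" response as a logistic function of a weighted sum of silence duration and F1 onset frequency, so a change in one cue can be offset by a proportional change in the other. All weights and stimulus values are invented.

```python
import math

def p_stay(silence_ms, f1_onset_hz, w_sil=0.08, w_f1=-0.01, bias=0.0):
    """Toy logistic identification model; all weights are invented.
    Longer silence favors 'stay'; higher F1 onset favors 'say'."""
    z = w_sil * silence_ms + w_f1 * f1_onset_hz + bias
    return 1.0 / (1.0 + math.exp(-z))

# Trading: at the same 60-ms silence, a lower (more stop-like)
# F1 onset yields more 'stay' responses than a higher one.
print(f"low F1 onset:  {p_stay(60, 450):.2f}")
print(f"high F1 onset: {p_stay(60, 650):.2f}")
```

In such a model, shrinking the weight on one cue, as the children's smaller trading relation suggests, directly shrinks how far the boundary on the other cue shifts.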

14.
This study investigated whether consonant phonetic features or consonant acoustic properties more appropriately describe perceptual confusions among speech stimuli in multitalker babble backgrounds. Ten normal-hearing subjects identified 19 consonants, each paired with /a/, /i/, and /u/ in a CV format. The stimuli were presented in quiet and in three levels of babble. Multidimensional scaling analyses of the confusion data retrieved stimulus dimensions corresponding to consonant acoustic parameters. The acoustic dimensions identified were: periodicity/burst onset, friction duration, consonant-vowel ratio, second formant transition slope, and first formant transition onset. These findings are comparable to previous reports of acoustic effects observed in white-noise conditions, and support the theory that acoustic characteristics are the relevant perceptual properties of speech in noise conditions. Perceptual effects of vowel context and level of the babble also were observed. These condition effects contrast with those previously reported for white-noise interference, and are attributed to direct masking of the low-frequency acoustic cues in the nonsense syllables by the low-frequency spectrum of the babble.
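The multidimensional scaling step can be sketched as follows, assuming a toy 4-consonant confusion matrix; the symmetrization and similarity-to-distance conversion are common choices assumed here, not necessarily this study's exact procedure.

```python
import numpy as np
from sklearn.manifold import MDS

# Toy confusion matrix (rows: presented, cols: reported) for 4 consonants.
labels = ["p", "t", "f", "s"]
conf = np.array([
    [50,  8,  2,  0],
    [10, 48,  0,  2],
    [ 3,  0, 45, 12],
    [ 0,  2, 15, 43],
], dtype=float)

# Symmetrize, normalize to similarities, and convert to distances.
sim = (conf + conf.T) / 2
sim /= sim.max()
dist = 1.0 - sim
np.fill_diagonal(dist, 0.0)

# Embed in 2-D; interpret the recovered axes as candidate acoustic dimensions.
mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dist)
for lab, (x, y) in zip(labels, coords):
    print(f"{lab}: ({x:+.2f}, {y:+.2f})")
```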

15.
This research examines the recognition of two-syllable spoken words and the means by which the auditory word recognition process deals with ambiguous stimulus information. The experiments reported here investigate the influence of individual syllables within two-syllable words on the recognition of each other. Specifically, perceptual identification of two-syllable words comprised of two monosyllabic words (spondees) was examined. Individual syllables within a spondee were characterized as either "easy" or "hard" depending on the syllable's neighborhood characteristics; an easy syllable was defined as a high-frequency word in a sparse neighborhood of low-frequency words, and a hard syllable as a low-frequency word in a high-density, high-frequency neighborhood. In Experiment 1, stimuli were created by splicing together recordings of the component syllables of the spondee, thus equating for syllable stress. Additional experiments tested the perceptual identification of naturally produced spondees, spliced nonwords, and monosyllables alone. Neighborhood structure had a strong effect on identification in all experiments. In addition, identification performance for spondees with a hard-easy syllable pattern was higher than for spondees with an easy-hard syllable pattern, indicating a primarily retroactive pattern of influence in spoken word recognition. Results strongly suggest that word recognition involves multiple activation and delayed commitment, thus ensuring accurate and efficient recognition.
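A rough sketch of the "easy"/"hard" classification described above: a word counts as easy if it is high-frequency and its neighbors (words differing by exactly one phoneme) are sparse and low-frequency. The mini-lexicon, frequencies, and threshold below are invented.

```python
# Hypothetical mini-lexicon: word -> (phoneme tuple, frequency per million).
LEXICON = {
    "cat": (("k", "ae", "t"), 40.0),
    "bat": (("b", "ae", "t"), 12.0),
    "cap": (("k", "ae", "p"), 9.0),
    "kit": (("k", "ih", "t"), 15.0),
}

def neighbors(word):
    """Words differing from `word` by exactly one phoneme (same length)."""
    target, _ = LEXICON[word]
    return [w for w, (p, _) in LEXICON.items()
            if w != word and len(p) == len(target)
            and sum(a != b for a, b in zip(p, target)) == 1]

def is_easy(word, freq_cut=20.0):
    """Easy: high-frequency word with only low-frequency neighbors."""
    word_freq = LEXICON[word][1]
    neigh_freqs = [LEXICON[n][1] for n in neighbors(word)]
    return word_freq >= freq_cut and all(f < freq_cut for f in neigh_freqs)

print("cat:", "easy" if is_easy("cat") else "hard")
```

A fuller treatment would also count neighborhood density itself; here the frequency test over a tiny lexicon is just enough to make the easy/hard contrast concrete.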

16.
It has previously been shown that pigeons can shift attention between parts and wholes of complex stimuli composed of larger, "global" characters constructed from smaller, "local" characters. The base-rate procedure biased target level within any condition toward either the local or global level: targets were more likely at one level than at the other. Biasing target level in this manner demonstrated shifts of local/global attention over a time span of several days with a fixed base rate. Experiment 1 examined the possibility that pigeons can shift attention between local and global levels of perceptual analysis in seconds rather than days. The experiment used priming cues whose color predicted, on a trial-by-trial basis, targets at different perceptual levels. The results confirmed that pigeons, like humans, can display highly dynamic stimulus-driven shifts of local/global attention. Experiment 2 changed spatial relations between features of priming cues and features of targets within a task otherwise similar to that of Experiment 1. It was predicted that this change in cues might affect asymmetry but not the occurrence of a priming effect. A priming effect was again obtained, thereby providing generality to the claim that pigeons can learn that trial-by-trial primes predict targets at different levels of perceptual analysis. Pigeons can display perceptual, stimulus-driven priming of a highly dynamic nature.

17.
Two- and 3-month-old infants were found to discriminate the acoustic cues for the phonetic feature of place of articulation in a categorical manner; that is, evidence for the discriminability of two synthetic speech patterns was present only when the stimuli signaled a change in the phonetic feature of place. No evidence of discriminability was found when two stimuli, separated by the same acoustic difference, signaled acoustic variations of the same phonetic feature. Discrimination of the same acoustic cues in a nonspeech context was found, in contrast, to be noncategorical or continuous. The results were discussed in terms of infants’ ability to process acoustic events in either an auditory or a linguistic mode.

18.
The processing of speech and nonspeech sounds by 23 reading disabled children and their age- and sex-matched controls was examined in a task requiring them to identify and report the order of pairs of stimuli. Reading disabled children were impaired in making judgments with very brief tones and with stop consonant syllables at short interstimulus intervals (ISIs). They had no unusual difficulty with vowel stimuli, vowel stimuli in a white noise background, or very brief visual figures. Poor performance on the tones and stop consonants appears to be due to specific difficulty in processing very brief auditory cues. The reading disabled children also showed deficits in the perception of naturally produced words, less sharply defined category boundaries, and a greater reliance on context in making phoneme identifications. The results suggest a perceptual deficit in some reading disabled children, which interferes with the processing of phonological information.

19.
This research investigated the effect of stimulus type and directed attention on dichotic listening performance with children. A sample of 12 academically high-performing children (5 male, 7 female; mean age 10.5 years) were administered four types of dichotic stimuli (words, digits, CV syllables, and melodies) in three experimental conditions (free recall, directed left, and directed right) to examine perceptual asymmetry as reflected by the right-ear advantage (REA). While the expected REA for words and CV syllables and the expected left-ear advantage (LEA) for melodies were found under free recall, the directed conditions produced varied results depending on the nature of the stimuli. Directed condition had no effect on recall of CV syllables but had a dramatic effect on recall of digits. Word stimuli and directed condition interacted to produce inconsistent perceptual asymmetry, while directed condition reduced overall recall for melodies. The findings lend support to the hypothesis that perceptual asymmetries can be strongly influenced by the type of stimulus material used and the attentional strategy employed.
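Ear advantages like the REA and LEA above are commonly quantified with a laterality index over correct reports from each ear; the formula below is that standard convention, assumed here for illustration, and the trial counts are invented rather than taken from this study.

```python
def laterality_index(right_correct: int, left_correct: int) -> float:
    """(R - L) / (R + L): positive values indicate a right-ear
    advantage (REA), negative values a left-ear advantage (LEA)."""
    total = right_correct + left_correct
    return (right_correct - left_correct) / total if total else 0.0

# Invented counts: CV syllables show an REA, melodies an LEA.
print("CV syllables:", laterality_index(right_correct=34, left_correct=22))
print("Melodies:    ", laterality_index(right_correct=18, left_correct=27))
```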

20.
Despite spectral and temporal discontinuities in the speech signal, listeners normally report coherent phonetic patterns corresponding to the phonemes of a language that they know. What is the basis for the internal coherence of phonetic segments? According to one account, listeners achieve coherence by extracting and integrating discrete cues; according to another, coherence arises automatically from general principles of auditory form perception; according to a third, listeners perceive speech patterns as coherent because they are the acoustic consequences of coordinated articulatory gestures in a familiar language. We tested these accounts in three experiments by training listeners to hear a continuum of three-tone, modulated sine wave patterns, modeled after a minimal pair contrast between three-formant synthetic speech syllables, either as distorted speech signals carrying a phonetic contrast (speech listeners) or as distorted musical chords carrying a nonspeech auditory contrast (music listeners). The music listeners could neither integrate the sine wave patterns nor perceive their auditory coherence to arrive at consistent, categorical percepts, whereas the speech listeners judged the patterns as speech almost as reliably as the synthetic syllables on which they were modeled. The outcome is consistent with the hypothesis that listeners perceive the phonetic coherence of a speech signal by recognizing acoustic patterns that reflect the coordinated articulatory gestures from which they arose.
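Sine-wave patterns of the kind described above are typically built by replacing each formant with a single sinusoid that follows the formant's frequency and amplitude track. The sketch below does this for made-up three-formant tracks; all track values are invented, not taken from the study's stimuli.

```python
import numpy as np

SR = 16000  # sample rate (Hz)

def sine_wave_analogue(formant_tracks, dur_s=0.3):
    """Sum one amplitude-scaled sinusoid per formant track.
    Each track is (start_hz, end_hz, amplitude); frequency is
    interpolated linearly and integrated to obtain phase."""
    n = int(SR * dur_s)
    out = np.zeros(n)
    for start_hz, end_hz, amp in formant_tracks:
        freq = np.linspace(start_hz, end_hz, n)    # Hz at each sample
        phase = 2 * np.pi * np.cumsum(freq) / SR   # integrate frequency
        out += amp * np.sin(phase)
    return out / np.max(np.abs(out))

# Invented tracks loosely shaped like a CV syllable's three formants.
tones = sine_wave_analogue([(700, 600, 1.0),
                            (1200, 1800, 0.6),
                            (2500, 2600, 0.3)])
```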
