When the (vocalic) formant transitions appropriate for the stops in a synthetic approximation to [spa] or [sta] are presented to one ear and the remainder of the acoustic pattern to the other, listeners report a duplex percept. One side of the duplexity is the same coherent syllable ([spa] or [sta]) that is perceived when the pattern is presented in its original, undivided form; the other is a nonspeech chirp that corresponds to what the transitions sound like in isolation. This phenomenon is here used to determine why, in the case of stops, silence is an important cue. The results show that the silence cue affects the formant transitions differently when, on the one side of the duplex percept, the transitions support the perception of stop consonants, and when, on the other, they are perceived as nonspeech chirps. This indicates that the effectiveness of the silence cue is owing to distinctively phonetic (as against generally auditory) processes.

Kindergarten and second-grade children's perception of voicing distinctions among the stop consonants was investigated by assessing their ability to identify and discriminate a series of synthetic speech stimuli varying in voice onset time (VOT). Perception of these sounds was found to be nearly categorical. No differences between the two age groups in either identification or discrimination performance were found; furthermore, the children's performance was comparable to adult performance in other studies using these stimuli. When considered in the context of data on the perception of VOT by infants as well as adults, the results suggest that the differential discriminability of stimuli along the VOT continuum has a biological basis.

The stop consonants /b, d, g, p, t, k/ were recorded before /i/, /a/, /u/. The energy spectrum for each stop consonant was removed from its original vowel and spliced onto a different steady-state vowel. Results of a recognition test revealed that consonants were accurately recognized in all cases except when /k/ or /g/ was spliced from /i/ to /u/. Further demonstrations suggested that /k/ and /g/ do have invariant characteristics before /i/, /a/, and /u/. These results support the general notion that stop consonants may be recognized before different vowels in normal speech in terms of invariant acoustic features.

The place of articulation of intervocalic stop consonants is conveyed by temporally distributed spectral information, viz, the formant transitions preceding and following the silent closure interval (VC and CV transitions). Experiment 1 shows that more than 200 msec of silent closure is needed to hear VC and CV formant transitions as separate phonemic events (geminate stops). As closure duration is reduced, these cues are integrated into a single phonemic percept, and the VC transitions become increasingly redundant (Experiments 2 and 3). VC and CV transitions conveying different places of articulation, on the other hand, are heard as separate phonemes at closure durations as short as 100 msec. If closure duration is further reduced, a single stop is heard whose place of articulation corresponds to the CV transitions (Experiment 3). Even in the absence of CV transitions, VC transitions carry little perceptual weight at very short closure durations (Experiment 4). Despite their apparent redundancy, however, the VC transitions exert a positive bias on the perception of CV transitions at very short closure durations. At closure durations beyond 100 msec, on the other hand, VC and CV transitions interact contrastively in perception and tend to be heard as different phonemes (Experiments 5 and 6). The results of these experiments suggest two different processes of temporal integration in phonetic perception, one taking place at a precategorical level, the other combining identical phoneme categories within a certain time span.

We examined the possible relevance of locus equations to human production and perception of stop consonants. The orderly output constraint (OOC) of Sussman, Fruchter, and Cable (1995) claims that humans have evolved to produce speech such that F2 at consonant release and F2 at vowel midpoint are linearly related for consonants so that developing perceptual systems can form representations in an F2 onset-by-F2 vowel space. The theory claims that this relationship described by locus equations can distinguish consonants, and that the linearity of locus equations is captured in neural representations and is thus perceptually relevant. We investigated these claims by testing how closely locus equations reflect the production and perception of stop consonants. In Experiment 1, we induced speakers to change their locus equation slope and intercept parameters systematically, but found that consonants remained distinctive in slope-by-intercept space. In Experiment 2, we presented stop-consonant syllables with their bursts removed to listeners, and compared their classification error matrices with the predictions of a model using locus equation prototypes and with those of an exemplar-based model that uses F2 onset and F2 vowel, but not locus equations. Both models failed to account for a large proportion of the variance in listeners’ responses; the locus equation model was no better in its predictions than the exemplar model. These findings are discussed in the context of the OOC.
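
For readers unfamiliar with the term, a locus equation is conventionally a straight-line fit of F2 at consonant release against F2 at vowel midpoint, computed per consonant across vowel contexts. The sketch below illustrates only that fit and is not the procedure used in the study above. The function name `fit_locus_equation` and the frequency values are hypothetical, chosen so that the data are exactly linear.

```python
from statistics import mean


def fit_locus_equation(f2_vowel_hz, f2_onset_hz):
    """Ordinary least-squares fit of F2_onset = slope * F2_vowel + intercept.

    Both arguments are equal-length sequences of frequencies in Hz,
    one pair per token of the same consonant in different vowel contexts.
    """
    if len(f2_vowel_hz) != len(f2_onset_hz) or len(f2_vowel_hz) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx = mean(f2_vowel_hz)
    my = mean(f2_onset_hz)
    sxx = sum((x - mx) ** 2 for x in f2_vowel_hz)
    if sxx == 0:
        raise ValueError("F2 vowel values must not all be identical")
    sxy = sum((x - mx) * (y - my) for x, y in zip(f2_vowel_hz, f2_onset_hz))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical, synthetic data (not measurements from any study):
# onsets are generated from a known line so the fit can be checked.
vowel = [2300.0, 1800.0, 1300.0, 900.0]
onset = [0.5 * v + 900.0 for v in vowel]

slope, intercept = fit_locus_equation(vowel, onset)
print(f"slope={slope:.3f}, intercept={intercept:.1f} Hz")
```

Each consonant then reduces to one (slope, intercept) point, which is the slope-by-intercept space the abstract refers to.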

The possibility that phonological confusions may underlie some difficulties in processing written language was investigated using four speech perception tasks. Twelve dyslexic and four normal-reading children identified and discriminated synthetic speech syllables which varied either in voice-onset time (signaling the feature of voicing) or direction of formant transitions (signaling place of articulation). Results indicate that, like normal-reading children and adults, dyslexic children perceive these sounds categorically. Discrimination of the stimuli was limited by their identifiability. It is suggested that linguistic disturbances at other stages of the grapheme to meaning transformation underlie misreading.

Phonetic segments are coarticulated in speech. Accordingly, the articulatory and acoustic properties of the speech signal during the time frame traditionally identified with a given phoneme are highly context-sensitive. For example, due to carryover coarticulation, the front tongue-tip position for /l/ results in more fronted tongue-body contact for a /g/ preceded by /l/ than for a /g/ preceded by /r/. Perception by mature listeners shows a complementary sensitivity: when a synthetic /da/-/ga/ continuum is preceded by either /al/ or /ar/, adults hear more /g/s following /l/ rather than /r/. That is, some of the fronting information in the temporal domain of the stop is perceptually attributed to /l/ (Mann, 1980). We replicated this finding and extended it to a signal-detection test of discrimination with adults, using triads of disyllables. Three equidistant items from a /da/-/ga/ continuum were used preceded by /al/ and /ar/. In the identification test, adults had identified item ga5 as "ga," and da1 as "da," following both /al/ and /ar/, whereas they identified the crucial item d/ga3 predominantly as "ga" after /al/ but as "da" after /ar/. In the discrimination test, they discriminated d/ga3 from da1 preceded by /al/ but not /ar/; compatibly, they discriminated d/ga3 readily from ga5 preceded by /ar/ but poorly preceded by /al/. We obtained similar results with 4-month-old infants. Following habituation to either ald/ga3 or ard/ga3, infants heard either the corresponding ga5 or da1 disyllable. As predicted, the infants discriminated d/ga3 from da1 following /al/ but not /ar/; conversely, they discriminated d/ga3 from ga5 following /ar/ but not /al/. The results suggest that prelinguistic infants disentangle consonant-consonant coarticulatory influences in speech in an adult-like fashion.

Identification and discrimination of two-formant [bae-dae-gae] and [pae-tae-kae] synthetic speech stimuli and discrimination of corresponding isolated second formant transitions (chirps) were performed by six subjects. Stimuli were presented at several intensity levels such that the intensity of the F2 transition was equated between speech and nonspeech stimuli, or the overall intensity of the stimulus was equated. At higher intensity (92 dB), b-d-g and p-t-k identification and between-category discrimination performance declined and bilabial-alveolar phonetic boundaries shifted in location on the continuum towards the F2 steady-state frequency. Between-category discrimination improved from performance at 92 dB when 92-dB speech stimuli were simultaneously masked by 60-dB speech noise; alveolar-velar boundaries shifted to a higher frequency location in the 92-dB-plus-noise condition. Chirps were discriminated categorically when presented at 58 dB, but discrimination peaks declined at higher intensities. Perceptual performance for chirps and p-t-k stimuli was very similar, and slightly inferior to performance for b-d-g stimuli, where simultaneous masking by F1 resulted in a lower effective intensity of F2. The results were related to a suggested model involving pitch comparison and transitional quality perceptual strategies.

Acoustic cues for the perception of place of articulation in aphasia
Two experiments assessed the abilities of aphasic patients and nonaphasic controls to perceive place of articulation in stop consonants. Experiment I explored labeling and discrimination of [ba, da, ga] continua varying in formant transitions with or without an appropriate burst onset appended to the transitions. Results showed general difficulty in perceiving place of articulation for the aphasic patients. Regardless of diagnostic category or auditory language comprehension score, discrimination ability was independent of labeling ability, and discrimination functions were similar to normals even in the context of failure to reliably label the stimuli. Further there was less variability in performance for stimuli with bursts than without bursts. Experiment II measured the effects of lengthening the formant transitions on perception of place of articulation in stop consonants and on the perception of auditory analogs to the speech stimuli. Lengthening the transitions failed to improve performance for either the speech or nonspeech stimuli, and in some cases, reduced performance level. No correlation was observed between the patient's ability to perceive the speech and nonspeech stimuli.

One of the basic questions that models of speech perception must answer concerns the conditions under which various cues will be extracted from a stimulus and the nature of the mechanisms which mediate this process. Two selective adaptation experiments were carried out to explore this question for the phonetic feature of place of articulation in both syllable-initial and syllable-final positions. In the first experiment, CV and VC stimuli were constructed with complete overlap in their second- and third-formant transitions. Despite this essentially complete overlap, no adaptation effects were found for a VC adaptor and a CV test series (or vice versa). In the second experiment, various vowel, vowel-like, and VC-like adaptors were used. The VC-like adaptors did have a significant effect on the CV category boundary, while the vowel and vowel-like stimuli did not. These results are interpreted within both one- and two-level models of selective adaptation. These models are distinguished by whether selective adaptation is assumed to affect a single auditory level of processing or to affect both an auditory level and a later phonetic level. However, both models incorporate detectors at the auditory level which respond whenever particular formant transitions are present. These auditory detectors are not sensitive to the position of the consonant transition information within the syllable.

The work reported here investigated whether the extent of McGurk effect differs according to the vowel context, and differs when cross‐modal vowels are matched or mismatched in Japanese. Two audio‐visual experiments were conducted to examine the process of audio‐visual phonetic‐feature extraction and integration. The first experiment was designed to compare the extent of the McGurk effect in Japanese in three different vowel contexts. The results indicated that the effect was largest in the /i/ context, moderate in the /a/ context, and almost nonexistent in the /u/ context. This suggests that the occurrence of McGurk effect depends on the characteristics of vowels and the visual cues from their articulation. The second experiment measured the McGurk effect in Japanese with cross‐modal matched and mismatched vowels, and showed that, except with the /u/ sound, the effect was larger when the vowels were matched than when they were mismatched. These results showed, again, that the extent of McGurk effect depends on vowel context and that auditory information processing before phonetic judgment plays an important role in cross‐modal feature integration.

Monkeys were presented with synthetic speech stimuli in a shock-avoidance situation. On the basis of their behavior, perceptual boundaries were determined along the physical continua between /ba/ and /pa/, and /ga/ and /ka/, that were close to the human boundaries between voiced and voiceless consonants. As is the case with humans, discrimination across a boundary was better than discrimination between stimuli that were both on one side of the boundary, and there was generalization of the voiced-voiceless distinction from labial to velar syllables. Unlike humans, the monkeys showed large shifts in boundary when the range of stimuli was varied.

The basis for the invariant perception of place of articulation in pre- and postvocalic stops was investigated using the selective adaptation paradigm. Experiments 1 and 2 considered the role of identical bursts, mirror-image formant transitions, and similar onset and offset spectra in the invariant perception of place of articulation in CV and VC stimuli, and Experiment 3 considered the importance of the second two cues in a VCV context. The results of these experiments suggest that, at the level of processing tapped by selective adaptation, neither identical bursts, mirror-image formant transitions, nor similar onset and offset spectra are the basis for the invariant perception of place of articulation in initial and final position. The vowel portion of an adapter was found to affect perception of the consonant portion of a stimulus, and the direction of this effect was predictable from the acoustic characteristics of the consonant and vowel. The implications of these findings for the nature of selective adaptation are discussed.

The effects of selective adaptation on the perception of consonant-vowel (CV) stimuli varying in place of production was studied under two conditions. In the first condition, repeated presentation of a CV syllable produced an adaptation effect resulting in a shift in the locus of the phonetic boundary between [ba] and [da]. This result replicated previously reported findings. However, in the second condition, an adaptation effect was obtained on this same test series when the critical acoustic information (i.e., formant transitions) was present in final position of a VC speech-like syllable. These latter results support an auditory account of selective adaptation based on the spectral similarity of the adapting stimuli and test series rather than a more abstract linguistic account based on phonetic identity.

Previous research has shown that infants are capable of perceiving many phonetic distinctions between initial segments of syllables. The present study demonstrates that 2-month-old infants have the ability to distinguish syllables differing only in their final segments. Infants were found to be sensitive to place-of-articulation differences for stop consonants in final segments of both consonant-vowel-consonant (CVC) and vowel-consonant (VC) syllable pairs. Contrary to previous reports for older infants (Shvachkin, 1973), there was no indication that 2-month-olds have any more difficulty with contrasts of final-stop consonants than they do with initial ones.

“Same”-“different” response latencies were measured for dichotic CV syllables at stimulus onset asynchronies (SOAs) of 20–480 msec. The results showed a consistent effect of distinctive feature separation on latencies and error rates, which persisted at longer SOAs. This supports the idea that phonemes are retained in memory in a distinctive feature code at least for half a second, and that such a code is also involved in the comparison of successive phonemes. The pattern of errors and latencies suggested two components in dichotic competition: “perceptual integration” at short SOAs and “perceptual noise” at longer SOAs. A “perceptual space” analogy is suggested: two stimuli must be separated by a minimum distance (including time separation as a dimension) to be perceived as distinct; if they are perceived as distinct, they are first categorized and then compared by assessing their distance in a phonetic space.

Experiments on selective adaptation have shown that the locus of the phonetic category boundary between two segments shifts after repetitive listening to an adapting stimulus. Theoretical interpretations of these results have proposed that adaptation occurs either entirely at an auditory level of processing or at both auditory and more abstract phonetic levels. The present experiment employed two alternating stimuli as adaptors in an attempt to distinguish between these two possible explanations. Two alternating stimuli were used as adaptors in order to test for the presence of contingent effects and to compare these results to simple adaptation using only a single adaptor. Two synthetic CV series with different vowels that varied the place of articulation of the consonant were employed. When two alternating adaptors were used, contingent adaptation effects were observed for the two stimulus series. The direction of the shifts in each series was governed by the vowel context of the adapting syllables. Using the single adaptor data, a comparison was made between the additive effects of the single adaptors and their combined effects when presented in alternating pairs. With voiced adaptors, only within-series adaptation effects were found, and these data were consistent with a one-level model of selective adaptation. However, for the voiceless adaptors, both within- and cross-series adaptation effects were found, suggesting the possible presence of two levels of adaptation to place of articulation. Further, the contingent adaptation effects with the voiceless adaptors seemed to be the result of the additive effects of the two alternating adaptors. This result indicates that previously reported contingent adaptation results may also reflect the net vowel-specific adaptation effects after cancellation of other, nonvowel-dependent effects and that caution is needed in interpreting such results.

Previous research has shown that 20-month-old infants can simultaneously learn two words that only differ by one of their consonants, but fail to do so when the words differ only by one of their vowels. This asymmetry was interpreted as developmental evidence for the proposal that consonants play a more important role than vowels in lexical specification. However, the consonant/vowel distinction was confounded with another distinction, that of the continuous status of the phonemes used (discontinuous stop consonants versus continuous vowels). The present study investigated 20-month-olds’ use of phonetic specificity while simultaneously learning two words that differ by a continuous consonant. The results obtained parallel those previously found for stop consonants, confirming the original claim of an asymmetry between the roles of consonants and vowels at the lexical level.

A series of experiments was conducted to examine the perceptual stability of stop consonants cued by silence alone, as when [s] + silence + [laet] is perceived as splat. Following a replication of this perceptual integration phenomenon (Experiment 1), attempts were made to block it by instructing subjects to disregard the initial [s] and to focus instead on the onset of the following signal, which was varied from [plaet] to [laet]. However, these instructions had little effect at short silence durations (Experiment 2), and they reduced stop percepts for only 2 subjects at longer silence durations (Experiment 3). That is, subjects were generally unable to voluntarily dissociate the [s] noise from the following signal and thus to perceive the silent interval as silence rather than as a carrier of phonetic information. A low-uncertainty paradigm facilitated the task somewhat (Experiment 4). However, when the [s] frication was replaced with broadband noise (Experiment 5), listeners had no trouble at all in the selective-attention task, except at very short silence durations (less than 40 ms). This last finding suggests that, except for the shortest durations, the effect of silence on phonetic perception does not arise at the level of psychoacoustic stimulus interactions. Rather, the results support the hypothesis that perceptual integration of speech components, including silence, is a largely obligatory perceptual function driven by the listener's tacit knowledge of phonetic regularities.

