Similar literature
20 similar documents found
1.
Previous research has shown that the ratio of vowel to rhyme (vowel + consonant) duration is a major cue for quantity in Icelandic. In particular it serves as a higher-order invariant which enables the listener to disentangle those durational transformations of the speech signal which are "extrinsic" (e.g., due to changes in speaking rate) from those which are "intrinsic" to the phonemic message, involving a change of phonemic quantity. Previous research has been based on speech segment contrasts which are purely durational, involving vowels with a uniform spectrum whether phonemically long or short, such as [a] or [I]. This paper looks at the role of spectral factors in vowels which are spectrally dissimilar in their long and short varieties. It is shown that in these cases the spectral differences can be sufficiently great to override the previously established relational invariant for quantity. The implications of this finding for a model of quantity perception are discussed.
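As a rough illustration of why a vowel-to-rhyme ratio can act as a rate-independent cue, consider the idealized case in which a change in speaking rate scales the vowel duration V and the consonant duration C by the same factor k > 0. The symbols V, C, and k, and the assumption of uniform scaling, are introduced here for illustration only and are not taken from the abstract.

```latex
\[
\frac{kV}{kV + kC} \;=\; \frac{V}{V + C}, \qquad k > 0 .
\]
```

Under this assumption the ratio is unchanged by a purely "extrinsic" rate change, whereas an "intrinsic" change of quantity alters V relative to C and therefore shifts the ratio.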

2.
Listeners are able to accurately recognize speech despite variation in acoustic cues across contexts, such as different speaking rates. Previous work has suggested that listeners use rate information (indicated by vowel length; VL) to modify their use of context-dependent acoustic cues, like voice-onset time (VOT), a primary cue to voicing. We present several experiments and simulations that offer an alternative explanation: that listeners treat VL as a phonetic cue rather than as an indicator of speaking rate, and that they rely on general cue-integration principles to combine information from VOT and VL. We demonstrate that listeners use the two cues independently, that VL is used in both naturally produced and synthetic speech, and that the effects of stimulus naturalness can be explained by a cue-integration model. Together, these results suggest that listeners do not interpret VOT relative to rate information provided by VL and that the effects of speaking rate can be explained by more general cue-integration principles.

3.
Speech segments are highly context-dependent and acoustically variable. One factor that contributes heavily to the variability of speech is speaking rate. Some speech cues are temporal in nature; that is, the distinctions that they signify are defined over time. How can temporal speech cues keep their distinctiveness in the face of extrinsic transformations, such as those wrought by different speaking rates? This issue is explored with respect to the perception, in Icelandic, of Voice Onset Time as a cue for word-initial stop voicing, word-initial aspiration as a cue for [h], and Voice Offset Time as a cue for pre-aspiration. All the speech cues show rate-dependent perception, though to different degrees, with Voice Offset Time being most sensitive to rate changes and Voice Onset Time least sensitive. The differences in the behaviour of these speech cues are related to their different positions in the syllable.

4.
Recent work by Summerfield (1975) and others indicates that a listener’s phonemic judgments may vary with the utterance rate of prior context. In particular, if a phonemic distinction is signaled by a temporal cue such as voice onset time (VOT), faster utterance rates tend to shift the phoneme boundary toward smaller values of that cue. The listener thus appears to “normalize” temporal cues according to utterance rate. In the present experiment, subjects identified syllables varying in VOT ([ga]-[kha]) following either a slow or a fast version of the phrase “Teddy hears ____.” Typical normalization effects were observed when the precursor phrase and target syllable had formant frequencies corresponding to an adult male vocal tract. However, a reversal of the typical pattern (i.e., a shift in the perceived voicing boundary toward larger values of VOT with an increased utterance rate) occurred when the precursor and target had formant frequencies corresponding to an adult female vocal tract. Both normalization and “reverse” normalization effects were reduced or eliminated under several conditions of source change between precursor and target. These conditions included a change in fundamental frequency, a change in implied vocal-tract size (as reflected in an upward or downward scaling of formant frequencies), or both.

5.
This study shows that the ratio of voice onset time (VOT) to syllable duration for /t/ and /d/ presents distributions with a stable boundary across speaking rates and that this boundary constitutes a perceptual criterion by which listeners judge the category affiliation of VOT. In Experiment 1, best-fit regression lines for VOT ratios of intervocalic /t/ and /d/ against speaking rate had zero slopes, and there was an inferable boundary between the distributions. In Experiment 2, listeners' identifications of syllable-initial stops conformed to this boundary ratio. In Experiment 3, VOT was held constant, while VOT ratios were altered by modifying the duration of the following vowel. As VOT ratios exceeded the boundary estimated from the data of Experiment 1, listeners' identifications shifted from /d/ to /t/. Timing relations in speech production can determine the identification of voicing categories across speaking rates.
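The ratio criterion described in this abstract can be sketched in a few lines of Python. This is only a minimal illustration: the function names and the boundary value of 0.15 are hypothetical placeholders, since the abstract does not report the boundary that was estimated in Experiment 1.

```python
def vot_ratio(vot_ms: float, syllable_ms: float) -> float:
    """Return VOT expressed as a fraction of the syllable duration."""
    if syllable_ms <= 0:
        raise ValueError("syllable duration must be positive")
    return vot_ms / syllable_ms


def classify_stop(vot_ms: float, syllable_ms: float, boundary: float = 0.15) -> str:
    """Label a stop as 't' when its VOT ratio exceeds the boundary, else 'd'.

    The default boundary (0.15) is an assumed value for illustration only.
    """
    return "t" if vot_ratio(vot_ms, syllable_ms) > boundary else "d"


# VOT held constant at 30 ms while the syllable is shortened, loosely
# mirroring the manipulation in Experiment 3 (durations are made up).
for syllable_ms in (300, 150):
    print(syllable_ms, classify_stop(30, syllable_ms))
```

With these made-up durations, the same 30 ms VOT falls on the /d/ side of the assumed boundary in the longer syllable and on the /t/ side in the shorter one, which is the direction of the shift the abstract reports.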

6.
Three experiments demonstrated that the pattern of changes in articulatory rate in a precursor phrase can affect the perception of voicing in a syllable-initial prestress velar stop consonant. Fast and slow versions of a 10-word precursor phrase were recorded, and sections from each version were combined to produce several precursors with different patterns of change in articulatory rate. Listeners judged the identity of a target syllable, selected from a 7-member /gi/-/ki/ voice-onset-time (VOT) continuum, that followed each precursor phrase after a variable brief pause. The major results were: (a) articulatory-rate effects were not restricted to the target syllable's immediate context; (b) rate effects depended on the pattern of rate changes in the precursor and not the amount of fast or slow speech or the proximity of fast or slow speech to the target syllable; and (c) shortening of the pause (or closure) duration led to a shortening of VOT boundaries rather than a lengthening as previously found in this phonetic context. Results are explained in terms of the role of dynamic temporal expectancies in determining the response to temporal information in speech, and implications for theories of extrinsic vs. intrinsic timing are discussed.

7.
We examine the evidence that speech and musical sounds exploit different acoustic cues: speech is highly dependent on rapidly changing broadband sounds, whereas tonal patterns tend to be slower, although small and precise changes in frequency are important. We argue that the auditory cortices in the two hemispheres are relatively specialized, such that temporal resolution is better in left auditory cortical areas and spectral resolution is better in right auditory cortical areas. We propose that cortical asymmetries might have developed as a general solution to the need to optimize processing of the acoustic environment in both temporal and frequency domains.

8.
It has recently been shown that listeners use systematic differences in vowel length and intonation to resolve ambiguities between onset-matched simple words (Davis, Marslen-Wilson, & Gaskell, 2002; Salverda, Dahan, & McQueen, 2003). The present study shows that listeners also use prosodic information in the speech signal to optimize morphological processing. The precise acoustic realization of the stem provides crucial information to the listener about the morphological context in which the stem appears and attenuates the competition between stored inflectional variants. We argue that listeners are able to make use of prosodic information, even though the speech signal is highly variable within and between speakers, by virtue of the relative invariance of the duration of the onset. This provides listeners with a baseline against which the durational cues in a vowel and a coda can be evaluated. Furthermore, our experiments provide evidence for item-specific prosodic effects.

9.
Research in speech perception has been dominated by a search for invariant properties of the signal that correlate with lexical and sublexical categories. We argue that this search for invariance has led researchers to ignore the perceptual consequences of systematic variation within such categories and that sensitivity to this variation may provide an important source of information for integrating information over time in speech perception. Data from a study manipulating VOT continua in words using an eye-movement paradigm indicate that lexical access shows graded sensitivity to within-category variation in VOT and that this sensitivity has a duration sufficient to be useful for information integration. These data support a model in which the perceptual system integrates information from multiple sources and from the surrounding temporal context using probabilistic cue-weighting mechanisms.

10.
Absolute and relative speech timing were examined in patients suffering from Parkinson's, Huntington's, and Wilson's disease. The task was to speak a standard sentence 10 times, first slowly, and then successively faster up to maximum rate. All patient groups had low maximal speech rates and showed decreased variability of speech rate. The duration of pauses between words was the same as in normals and the relative time structure of the test sentence was basically preserved. For comparison, two cases with nonfluent aphasia had even slower speech rates, large increases in pause duration, and major changes in relative speech timing. The results show the same type of alterations of the temporal organization of speech as those characteristic for rapid alternating limb movements in such patients. They support the view that the speech and skeletomotor systems share common neural control modes despite fundamental biomechanical differences. The common denominator between the speech and the skeletomotor disturbances in basal ganglia diseases may be the undamping and slowing of a fast central oscillator.

11.
Infant directed speech (IDS) is a speech register characterized by simpler sentences, a slower rate, and more variable prosody. Recent work has implicated it in more subtle aspects of language development. Kuhl et al. (1997) demonstrated that segmental cues for vowels are affected by IDS in a way that may enhance development: the average locations of the extreme “point” vowels (/a/, /i/ and /u/) are further apart in acoustic space. If infants learn speech categories, in part, from the statistical distributions of such cues, these changes may specifically enhance speech category learning. We revisited this by asking (1) if these findings extend to a new cue (Voice Onset Time, a cue for voicing); (2) whether they extend to the interior vowels which are much harder to learn and/or discriminate; and (3) whether these changes may be an unintended phonetic consequence of factors like speaking rate or prosodic changes associated with IDS. Eighteen caregivers were recorded reading a picture book including minimal pairs for voicing (e.g., beach/peach) and a variety of vowels to either an adult or their infant. Acoustic measurements suggested that VOT was different in IDS, but not in a way that necessarily supports better development, and that these changes are almost entirely due to slower rate of speech of IDS. Measurements of the vowel suggested that in addition to changes in the mean, there was also an increase in variance, and statistical modeling suggests that this may counteract the benefit of any expansion of the vowel space. As a whole this suggests that changes in segmental cues associated with IDS may be an unintended by-product of the slower rate of speech and different prosodic structure, and do not necessarily derive from a motivation to enhance development.

12.
Developmental research reporting electrophysiological correlates of voice onset time (VOT) during speech perception is reviewed. By two months of age a right hemisphere mechanism appears which differentiates voiced from voiceless stop consonants. This mechanism was found at 4 years of age and again with adults. A new study is described which represents an attempt to determine a more specific basis for VOT perception. Auditory evoked responses (AER) were recorded over the left and right hemispheres while 16 adults attended to repetitive series of two-tone stimuli. Portions of the AERs were found to vary systematically over the two hemispheres in a manner similar to that previously reported for VOT stimuli. These findings are discussed in terms of a temporal detection mechanism which is involved in speech perception.

13.
Timing cues present in the acoustic waveform of speech provide critical information for the recognition and segmentation of the ongoing speech signal. Research has demonstrated that deficient temporal perception rates, which have been shown to specifically disrupt acoustic processing of speech, are related to specific language-based learning impairments (LLI). Temporal processing deficits correlate highly with the phonological discrimination and processing deficits of these children. Electrophysiological single cell mapping studies of sensory cortex in brains of primates have shown that neural circuitry can be remapped after specific, temporally cohesive training regimens, demonstrating the dynamic plasticity of the brain. Recently, we combined these two lines of research in a series of studies that addressed whether the temporal processing deficits seen in LLIs can be significantly modified through adaptive training aimed at reducing temporal integration thresholds. Simultaneously, we developed a computer algorithm that expanded and enhanced the brief, rapidly changing acoustic segments within ongoing speech and used this to provide intensive speech and language training exercises to these children. Results to date from two independent laboratory experiments, as well as a large national clinical efficacy trial, demonstrate that dramatic improvements in temporal integration thresholds, together with speech and language comprehension abilities of LLI children, result from training with these new computer-based training procedures.

14.
We explored the degree to which the duration of acoustic cues contributes to the respective involvement of the two hemispheres in the perception of speech. To this end, we recorded the reaction time needed to identify monaurally presented natural French plosives with varying VOT values. The results show that a right-ear advantage is significant only when the phonetic boundary is close to the release burst, i.e., when the identification of the two successive acoustical events (the onset of voicing and the release from closure) needed to perceive a phoneme as voiced or voiceless requires rapid information processing. These results are consistent with the recent hypothesis that the left hemisphere is superior in the processing of rapidly changing acoustical information.

15.
Across languages, children with developmental dyslexia have a specific difficulty with the neural representation of the sound structure (phonological structure) of speech. One likely cause of their difficulties with phonology is a perceptual difficulty in auditory temporal processing (Tallal, 1980). Tallal (1980) proposed that basic auditory processing of brief, rapidly successive acoustic changes is compromised in dyslexia, thereby affecting phonetic discrimination (e.g. discriminating /b/ from /d/) via impaired discrimination of formant transitions (rapid acoustic changes in frequency and intensity). However, an alternative auditory temporal hypothesis is that the basic auditory processing of the slower amplitude modulation cues in speech is compromised (Goswami et al., 2002). Here, we contrast children's perception of a synthetic speech contrast (ba/wa) when it is based on the speed of the rate of change of frequency information (formant transition duration) versus the speed of the rate of change of amplitude modulation (rise time). We show that children with dyslexia have excellent phonetic discrimination based on formant transition duration, but poor phonetic discrimination based on envelope cues. The results explain why phonetic discrimination may be allophonic in developmental dyslexia (Serniclaes et al., 2004), and suggest new avenues for the remediation of developmental dyslexia.

16.
It is widely accepted that duration can be exploited as phonological phrase final lengthening in the segmentation of a novel language, i.e., in extracting discrete constituents from continuous speech. The use of final lengthening for segmentation and its facilitatory effect has been claimed to be universal. However, lengthening in the world's languages can also mark lexically stressed syllables. Stress-induced lengthening can potentially be in conflict with right edge phonological phrase boundary lengthening. Thus the processing of durational cues in segmentation can be dependent on the listener’s linguistic background, e.g., on the specific correlates and unmarked location of lexical stress in the native language of the listener. We tested this prediction and found that segmentation by both German and Basque speakers is facilitated when lengthening is aligned with the word final syllable and is not affected by lengthening on either the penultimate or the antepenultimate syllables. Lengthening of the word final syllable, however, does not help Italian and Spanish speakers to segment continuous speech, and lengthening of the antepenultimate syllable impedes their performance. We have also found a facilitatory effect of penultimate lengthening on segmentation by Italians. These results confirm our hypothesis that processing of lengthening cues is not universal, and interpretation of lengthening as a phonological phrase final boundary marker in a novel language of exposure can be overridden by the phonology of lexical stress in the native language of the listener.

17.
18.
Tasks assessing perception of a phonemic contrast based on voice onset time (VOT) and a nonspeech analog of a VOT contrast using tone onset time (TOT) were administered to children (ages 7.5 to 15.9 years) identified as having reading disability (RD; n = 21), attention deficit/hyperactivity disorder (ADHD; n = 22), comorbid RD and ADHD (n = 26), or no impairment (NI; n = 26). Children with RD, whether they had been identified as having ADHD or not, exhibited reduced perceptual skills on both tasks as indicated by shallower slopes on category labeling functions and reduced accuracy even at the endpoints of the series where cues are most salient. Correlations between performance on the VOT task and measures of single word decoding and phonemic awareness were significant only in the groups without ADHD. These findings suggest that (a) children with RD have difficulty in processing speech and nonspeech stimuli containing similar auditory temporal cues, (b) phoneme perception is related to phonemic awareness and decoding skills, and (c) the potential presence of ADHD needs to be taken into account in studies of perception in children with RD.

19.
The present study represents a contemporary test of traditional assumptions about sex effects in social interaction. An experiment was conducted to assess the independent and interactive effects of communicator sex, listener sex, and interpersonal distance on temporal measures of conversational interaction. Results demonstrated that the average duration of speech acts was significantly longer for females than for males; that communicators, regardless of sex, speak for a greater proportion of the total conversation when the listener is female as opposed to male; and that within the same-sex male dyads, far interpersonal distance is associated with significantly greater simultaneous speech when compared to the near condition. Results are interpreted as a refutation of traditional notions of male dominance.

20.
The ability to interpret vocal (prosodic) cues during social interactions can be disrupted by Parkinson's disease, with notable effects on how emotions are understood from speech. This study investigated whether PD patients who have emotional prosody deficits exhibit further difficulties decoding the attitude of a speaker from prosody. Vocally inflected but semantically nonsensical ‘pseudo-utterances’ were presented to listener groups with and without PD in two separate rating tasks. Task 1 required participants to rate how confident a speaker sounded from their voice and Task 2 required listeners to rate how polite the speaker sounded for a comparable set of pseudo-utterances. The results showed that PD patients were significantly less able than HC participants to use prosodic cues to differentiate intended levels of speaker confidence in speech, although the patients could accurately detect the polite/impolite attitude of the speaker from prosody in most cases. Our data suggest that many PD patients fail to use vocal cues to effectively infer a speaker's emotions as well as certain attitudes in speech such as confidence, consistent with the idea that the basal ganglia play a role in the meaningful processing of prosodic sequences in spoken language (Pell & Leonard, 2003).
