Similar articles
Found 20 similar articles (search time: 31 ms)
1.
Imitations of 15 synthesized vowels, some like English vowels and some not, were obtained from nine adults and ten 6-year-old children. Estimates of the first three formant frequencies (F1, F2, and F3) were made from spectrograms of the vowel imitations. The reliability of reproduction was assessed by calculating standard deviations for five imitations each of ten of the synthetic stimuli. Generally, both the intrasubject and intersubject variabilities were greater for the children than for the adults. However, the differences in intrasubject variability between the two groups often were not much greater than the difference in measurement error which would be expected for voices of different fundamental frequencies. Subjects tended to reproduce the non-English vowels less reliably than the English vowels, although the adults were less influenced by phonetic familiarity than were the children. Vowel familiarity appeared to be especially important for reliable reproduction of the F2 frequency. Plotting of the imitation data for English vowels in an F1–F2 plane with linear dimensions revealed a fairly systematic clustering for the four age-sex groups of men, women, boys, and girls, but the group clustering was not so systematic for the imitation data for the non-English vowels. This work was supported by Public Health Service Research Grant NS-HD-12281.

2.
Multidimensional scaling (MDS) was used to compare perceptual maps for 10 synthetic English vowels in humans and Old World monkeys (Macaca fuscata and Cercopithecus albogularis). Subjects discriminated among the vowels using a repeating background procedure, and reaction times were submitted to an MDS analysis to derive measures of perceived similarity. The dimensions that emerged related to the frequencies of the first (F1), second (F2), and third (F3) formants. Human data indicated a good match to previous MDS studies using rating procedures or confusion matrices: The dominant dimension mapped onto vowel F2, the phonetically most important formant, and the second and third dimensions mapped onto F1 and F3, respectively. For monkeys, equal weightings occurred for F1 and F2, and F3 was not clearly represented. Monkey sensitivity to the formants appeared to relate to formant amplitudes. If monkeys are giving an accurate representation of the psychoacoustic relations among the formants, then our human results suggest that species-specific mechanisms, reflecting the salience of the phonetic feature of advancement, may contribute to vowel coding in humans.

3.
Three experiments investigated whether extrinsic vowel normalization takes place largely at a categorical or a precategorical level of processing. Traditional vowel normalization effects in categorization were replicated in Experiment 1: Vowels taken from an [ɪ]–[ε] continuum were more often interpreted as /ɪ/ (which has a low first formant, F1) when the vowels were heard in contexts that had a raised F1 than when the contexts had a lowered F1. This was established with contexts that consisted of only two syllables. These short contexts were necessary for Experiment 2, a discrimination task that encouraged listeners to focus on the perceptual properties of vowels at a precategorical level. Vowel normalization was again found: Ambiguous vowels were more easily discriminated from an endpoint [ε] than from an endpoint [ɪ] in a high-F1 context, whereas the opposite was true in a low-F1 context. Experiment 3 measured discriminability between pairs of steps along the [ɪ]–[ε] continuum. Contextual influences were again found, but without discrimination peaks, contrary to what was predicted from the same participants’ categorization behavior. Extrinsic vowel normalization therefore appears to be a process that takes place at least in part at a precategorical processing level.

4.
Two experiments investigating the selective adaptation of vowels examined changes in listeners’ identification functions for the vowel continuum [i-I-ε] as a function of the adapting stimulus. In Experiment I, the adapting stimuli were [i], [I], and [ε]. Both the [i] and [ε] stimuli produced significant shifts in the neighboring and distant phonetic boundaries, whereas [I] did not result in any adaptation effects. In order to explore the phonetic nature of feature adaptation in vowels, a second experiment was conducted using the adapting stimuli [gig] and [gεg], which differed acoustically from the [i] and [ε] vowels on the identification continuum. Only [gig] yielded reliable adaptation effects. The results of these experiments were interpreted as suggesting a relative rather than a stable auditory mode of feature analysis in vowels and a possibly more complex auditory feature analysis for the vowel [i].

5.
Listeners tune in to talkers’ vowels through extrinsic normalization. We asked here whether this process could be based on compensation for the long-term average spectrum (LTAS) of preceding sounds and whether the mechanisms responsible for normalization are indifferent to the nature of those sounds. If so, normalization should apply to nonspeech stimuli. Previous findings were replicated with first-formant (F1) manipulations of speech. Targets on a [pɪt]–[pεt] (low–high F1) continuum were labeled as [pɪt] more after high-F1 than after low-F1 precursors. Spectrally rotated nonspeech versions of these materials produced similar normalization. None occurred, however, with nonspeech stimuli that were less speechlike, even though precursor–target LTAS relations were equivalent to those used earlier. Additional experiments investigated the roles of pitch movement, amplitude variation, formant location, and the stimuli's perceived similarity to speech. It appears that normalization is not restricted to speech but that the nature of the preceding sounds does matter. Extrinsic normalization of vowels is due, at least in part, to an auditory process that may require familiarity with the spectrotemporal characteristics of speech.

6.
It has been demonstrated using the “silent-center” (SC) syllable paradigm that there is sufficient information in syllable onsets and offsets, taken together, to support accurate identification of vowels spoken in both citation-form syllables and syllables spoken in sentence context. Using edited natural speech stimuli, the present study examined the identification of American English vowels when increasing amounts of syllable onsets alone or syllable offsets alone were presented in their original sentence context. The stimuli were /d/-vowel-/d/ syllables spoken in a short carrier sentence by a male speaker. Listeners attempted to identify the vowels in experimental conditions that differed in the number of pitch periods presented and whether the pitch periods were from syllable onsets or syllable offsets. In general, syllable onsets were more informative than syllable offsets, although neither onsets nor offsets alone specified vowel identity as well as onsets and offsets together (SC syllables). Vowels differed widely in ease of identification; the diphthongized long vowels /e/, /ae/, /o/ were especially difficult to identify from syllable offsets. Identification of vowels as “front” or “back” was accurate, even from short samples of the syllable; however, vowel “height” was quite difficult to determine, again, especially from syllable offsets. The results emphasize the perceptual importance of time-varying acoustic parameters, which are the direct consequence of the articulatory dynamics involved in producing syllables.

7.
An interesting phenomenon in human speech perception is the trading relation, in which two different acoustic cues both signal the same phonetic percept. The present study compared American English, Spanish, and monkey listeners in their perception of the trading relation between gap duration and F1 transition onset frequency in a synthetic say-stay continuum. For all the subjects, increased gap duration caused perception to change from say to stay; however, subjects differed in the extent to which the F1 cue traded with gap duration. For American English listeners, a change from a low to a high F1 onset caused a phoneme boundary shift of 26 msec toward shorter gap durations, indicating a strong trading relation. For Spanish listeners, the shift was significantly smaller at 13.7 msec, indicating a weaker trading relation. For monkeys, there was no shift at all, indicating no trading relation. These results provide evidence that the say-stay trading relation is dependent on perceptual learning from linguistic exposure.

8.
9.
The ability to compensate for fixation of the jaw by a bite block was investigated in 6 nonfluent aphasics, 6 fluent aphasics, and 10 normal control subjects. Acoustic analyses of the vowels [i u a æ] and fricatives [s ʃ] revealed substantial but incomplete compensation for the perturbation in all three subject groups. Perceptual identification scores and quality ratings by naive and phonetically trained listeners indicated poorer identification of the high vowels [i u] under compensatory conditions relative to normal production. Of particular interest was the fact that all three groups of subjects exhibited similar patterns of results. The findings suggest that any deficit in speech motor programming demonstrated by the nonfluent aphasic patients did not affect compensatory abilities. Results are discussed with respect to normal speech adaptation skills and the nature of articulatory breakdown in nonfluent aphasia.

10.
This paper shows the need to triangulate different approaches in Bilingualism and Second Language Acquisition (SLA) research to fully understand late bilinguals’ interlanguage grammars. Methodologically, we show how experimental and corpus data can be (and should be) triangulated by reporting on a corpus study (Lozano and Mendikoetxea in Biling Lang Cognit 13(4):475–497, 2010) and a new follow-up offline experiment investigating Subject–Verb inversion (Subject–Verb/Verb–Subject order) in L1 Spanish–L2 English (n = 417). Theoretically, we follow a recent line in psycholinguistic approaches to Bilingualism and SLA research (Interface Hypothesis, Sorace in Linguist Approaches Biling 1(1):1–33, 2011). It focuses on the interface between syntax and language-external modules of the mind/brain (syntax-discourse [end-focus principle] and syntax-phonology [end-weight principle]) as well as a language-internal interface (lexicon-syntax [unaccusative hypothesis]). We argue that it is precisely this multi-faceted interface approach (corpus and experimental data, core syntax and the interfaces, representational and processing models) that provides a deeper understanding of (i) the factors that favour inversion in L2 acquisition in particular and (ii) interlanguage grammars in general.

11.
12.
Identifiability of vowels and speakers from whispered syllables (total citations: 4; self-citations: 0; citations by others: 4)
In the present experiments, the effect of whisper register on speech perception was measured. We assessed listeners' abilities to identify 10 vowels in [hVd] context pronounced by 3 male and 3 female speakers in normal and whisper registers. Results showed 82% average identification accuracy in whisper mode, approximately a 10% falloff in identification accuracy from normally phonated speech. In both modes, significant confusions of [o] for [a] occurred, with some additional significant confusions occurring in whisper mode among vowels adjacent in F1/F2 space. We also assessed listeners' abilities to match whispered syllables with normally phonated ones by the same speaker. Each trial contained the matching syllable and two foils whispered by speakers of the same sex as the speaker of the target. Identification performance was significantly better than chance across subjects, speakers, and vowels, with no listener achieving better than 96% performance. Acoustic analyses measured potential cues to speaker identity independent of register.

13.
This study investigated how Japanese-speaking learners of English pronounce the three point vowels /i/, /u/, and /a/ appearing in the first and second monosyllabic words of English noun phrases, and the schwa /ə/ appearing in English disyllabic words. First and second formant (F1 and F2) values were measured for four Japanese speakers and two American English speakers. The hypothesis that the area encompassed by the point vowels in the F1-F2 vowel space tends to be smaller for the Japanese speakers than for the English speakers was verified. The hypothesis that the area formed by the three schwas in chicken, spoonful, and Tarzan (the second-syllable vowel in each word) is greater for the Japanese speakers than for the English speakers and its related hypothesis were largely upheld. Implications for further research are briefly discussed.
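The area comparison described in this abstract can be illustrated with a minimal sketch. It assumes the vowel-space area is computed as the area of the triangle whose corners are the (F1, F2) values of the three vowels, using the shoelace formula; the abstract does not say how the authors computed it. The function name and all formant values below are hypothetical and are not data from the study.

```python
def vowel_space_area(points):
    """Area of the polygon spanned by (F1, F2) points, via the shoelace formula.

    `points` is a list of (F1, F2) pairs in Hz, given in order around the
    polygon. For three point vowels this is the area of the vowel triangle.
    """
    n = len(points)
    twice_area = 0.0
    for k in range(n):
        f1_a, f2_a = points[k]
        f1_b, f2_b = points[(k + 1) % n]  # wrap around to close the polygon
        twice_area += f1_a * f2_b - f1_b * f2_a
    return abs(twice_area) / 2.0


# Hypothetical (F1, F2) values in Hz for /i/, /u/, /a/ (illustrative only).
speaker_a = [(300.0, 2300.0), (320.0, 900.0), (750.0, 1300.0)]
speaker_b = [(350.0, 2100.0), (370.0, 1100.0), (700.0, 1350.0)]

area_a = vowel_space_area(speaker_a)
area_b = vowel_space_area(speaker_b)
print(area_a, area_b, area_b < area_a)
```

With these made-up values the second speaker's triangle is smaller, which is the kind of difference the first hypothesis is about. The result is in Hz squared; studies often convert formants to a perceptual scale first, which this sketch does not do.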

14.
Phoneme labeling and discrimination experiments were conducted with a continuum of voiced stops produced by a Terminal Analog Speech Synthesizer. The stops ranged from |b| to |d|. Only second formant (F2) transitions changed from one sound to another. (A formant is energy concentrated in a narrow frequency range.) In the labeling experiment conducted to locate the phoneme boundary, subjects identified the individual stimuli as |b| or |d|. In discrimination, difference and identity pairs were presented, with alternative responses of same and different. This allows separate consideration of discrimination (different/Different) and recognition (same/Identity) hits, and also analysis of the data in accordance with the theory of signal detectability. The sounds were discriminated with and without F1 and F3, which contained no discriminatory information but are responsible for perceived similarity to speech. With F1 and F3, sensitivity (d′) was highest at the |b-d| boundary, but without F1 and F3 this was not true. Spectral analysis of the sounds both with and without F1 and F3 revealed a phonemic energy discontinuity for the 1/3 octave around the F2 steady-state frequency (1250 Hz). It therefore seems probable that subjects listened to frequencies which contained phonemic information when F1 and F3 were included, but not when they were omitted. In spite of the high sensitivity at the |b-d| boundary, recognition hits (same/Identity) were lowest there: a pair at the boundary had to sound less like a difference to be called different than a pair away from the boundary. Indications, then, are quite strong that auditory-frequency selection helps the perception of speech, and it is clear that a strategy of criterion lowering helps it.
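The sensitivity and criterion measures mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes the textbook equal-variance yes/no formulas, d′ = z(H) − z(F) and c = −(z(H) + z(F))/2, where H is the rate of "different" responses to difference pairs and F is the rate of "different" responses to identity pairs. The abstract does not state which detection model the authors used, and same-different designs are often analysed with other models, so the function names and formulas here are assumptions.

```python
from statistics import NormalDist

# Inverse of the standard normal CDF (the z-transform of a proportion).
_z = NormalDist().inv_cdf


def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity under the equal-variance yes/no model: z(H) - z(F).

    Both rates must lie strictly between 0 and 1; inv_cdf raises
    StatisticsError otherwise.
    """
    return _z(hit_rate) - _z(false_alarm_rate)


def criterion(hit_rate, false_alarm_rate):
    """Response criterion c = -(z(H) + z(F)) / 2.

    Negative values mean a lowered criterion, i.e. a bias toward
    responding "different".
    """
    return -(_z(hit_rate) + _z(false_alarm_rate)) / 2.0


# Hypothetical rates (not data from the study): same sensitivity,
# but the second observer responds "different" more readily.
print(d_prime(0.69, 0.31), criterion(0.69, 0.31))
print(d_prime(0.84, 0.50), criterion(0.84, 0.50))
```

Separating the two quantities is what lets a study report high sensitivity at the boundary while also describing a shift in the criterion there.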

15.
16.
For native speakers of English and several other languages, preceding vocalic duration and F1 offset frequency are two of the cues that convey the stop consonant voicing distinction in word-final position. For speakers learning English as a second language, there are indications that use of vocalic duration, but not F1 offset frequency, may be hindered by a lack of experience with phonemic (i.e., lexical) vowel length (the “phonemic vowel length account”: Crowther & Mann, 1992). In this study, native speakers of Arabic, a language that includes a phonemic vowel length distinction, were tested for their use of vocalic duration and F1 offset in production and perception of the English consonant-vowel-consonant forms pod and pot. The phonemic vowel length hypothesis predicts that Arabic speakers should use vocalic duration extensively in production and perception. On the contrary, Experiment 1 revealed that, consistent with Flege and Port’s (1981) findings, they produced only slightly (but significantly) longer vocalic segments in their pod tokens. It further indicated that their productions showed a significant variation in F1 offset as a function of final stop voicing. Perceptual sensitivity to vocalic duration and F1 offset as voicing cues was tested in two experiments. In Experiment 2, we employed a factorial combination of these two cues and a finely spaced vocalic duration continuum. Arabic speakers did not appear to be very sensitive to vocalic duration, but they were about as sensitive as native English speakers to F1 offset frequency. In Experiment 3, we employed a one-dimensional continuum of more widely spaced stimuli that varied only vocalic duration. Arabic speakers showed native-English-like sensitivity to vocalic duration. An explanation based on the perceptual anchor theory of context coding (Braida et al., 1984; Macmillan, 1987; Macmillan, Braida, & Goldberg, 1987) and phoneme perception theory (Schouten & Van Hessen, 1992) is offered to reconcile the apparently contradictory perceptual findings. The explanation does not attribute native-English-like voicing perception to the Arabic subjects. The findings in this study call for a modification of the phonemic vowel length hypothesis.

17.
Phonetically governed changes in the fundamental frequency (F0) of vowels that immediately precede and follow voiceless stop plosives have been found to follow consistent patterns in adults and children as young as four years of age. In the present study, F0 onset and offset patterns in 14 children who stutter (CWS) and 14 children who do not stutter (CWNS) were investigated to evaluate differences in speech production. Participants produced utterances containing two VCV sequences. F0 patterns in the last ten vocal cycles in the preceding vowel (voicing offset) and the first ten vocal cycles in the subsequent vowel (voicing onset) were analyzed. A repeated measures ANOVA revealed no group differences between the CWS and CWNS in either voicing onset or offset gestures. Both groups showed patterns of F0 onset and offset that were consistent with the mature patterns seen in children and adults in previous studies. These findings suggest that in both CWS and CWNS, a mature pattern of voicing onset and offset is present by age 3;6. This study suggests that there is no difference between CWS and CWNS in the coordination of respiratory and laryngeal systems during voicing onset or offset. Educational objectives: The reader will be able to: (a) discuss the importance of investigating children who stutter close to the onset of stuttering; (b) describe the typical change in F0 during voicing onset; (c) discuss the potential implications of these results with regard to future research.

18.
Vowels are better identified in a consonantal syllabic context than as isolated vowels. This finding is contrary to predictions from traditional theories of vowel perception. The poor perception of isolated vowels might be attributed to a lack of dynamic acoustic cues or to familiarity effects related to the phonological rules of English. Vowel identification tests were conducted using six talkers, nine vowels, and seven syllabic contexts. Consonantal context improved vowel identification; final consonants aided identification more than initial consonants. No consistent support was found for the effect of phonological rules, but duration information was seen to play a critical role. Results constitute a challenge to traditional theories of vowel perception.

19.
Mean squared error of prediction is used as the criterion for determining which of two multiple regression models (not necessarily nested) is more predictive. We show that an unrestricted (or true) model with $t$ parameters should be chosen over a restricted (or misspecified) model with $m$ parameters if $(P_t^2 - P_m^2) > (1 - P_t^2)(t - m)/n$, where $P_t^2$ and $P_m^2$ are the population coefficients of determination of the unrestricted and restricted models, respectively, and $n$ is the sample size. The left-hand side of the above inequality represents the squared bias in prediction by using the restricted model, and the right-hand side gives the reduction in variance of prediction error by using the restricted model. Thus, model choice amounts to the classical statistical tradeoff of bias against variance. In practical applications, we recommend that $P^2$ be estimated by adjusted $R^2$. Our recommendation is equivalent to performing the $F$-test for model comparison, and using a critical value of $2 - (m/n)$; that is, if $F > 2 - (m/n)$, the unrestricted model is recommended; otherwise, the restricted model is recommended.
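The decision rule in this abstract can be written out as a short sketch. It assumes the garbled operators in the inequality are minus signs, so that the rule reads: choose the unrestricted model if (P_t² − P_m²) > (1 − P_t²)(t − m)/n. The function name is hypothetical, and the inputs are assumed to be adjusted R² values standing in for the population quantities, as the abstract recommends.

```python
def prefer_unrestricted(p2_t, p2_m, t, m, n):
    """Return True if the unrestricted model is preferred.

    p2_t, p2_m: coefficients of determination (e.g. adjusted R^2) of the
                unrestricted and restricted models.
    t, m:       number of parameters in each model (t > m).
    n:          sample size.
    """
    # Left-hand side: squared bias incurred by using the restricted model.
    squared_bias = p2_t - p2_m
    # Right-hand side: reduction in prediction variance from using it.
    variance_reduction = (1.0 - p2_t) * (t - m) / n
    return squared_bias > variance_reduction


# Hypothetical values: the same gain in fit favours the larger model
# when n is large, but not when n is small.
print(prefer_unrestricted(0.50, 0.45, t=6, m=3, n=100))  # 0.05 vs 0.015
print(prefer_unrestricted(0.50, 0.45, t=6, m=3, n=20))   # 0.05 vs 0.075
```

The two calls show the bias-variance tradeoff the abstract describes: the extra parameters are worth their variance cost only when the sample is large enough.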

20.
E. A. Peel. Psychometrika, 1946, 11(2): 129-137
The aesthetic preferences of a group of persons are obtained from their orders of sets of pictures and patterns according to “liking.” The same pictures are ordered independently by a team of experts, according to certain artistic criteria such as naturalism, composition, color, rhythm, etc. The orders of preference and orders according to the criteria are compared by correlation and matrices of correlation formed from (1) correlations between the persons' orders of preference; (2) correlations between the orders of preference and orders according to artistic criteria; and (3) correlations between the criterion orders. These matrices are symbolised by $R_p$, $R_o$, and $R_c$, respectively, and combined to form a single matrix
$$\begin{bmatrix} R_p & R_o \\ R'_o & R_c \end{bmatrix}$$
