Statistical learning allows listeners to track transitional probabilities among syllable sequences and use these probabilities for subsequent speech segmentation. Recent studies have shown that other sources of information, such as rhythmic cues, can modulate the dependencies extracted via statistical computation. In this study, we explored how syllables made salient by a pitch rise affect the segmentation of trisyllabic words from an artificial speech stream by native speakers of three different languages (Spanish, English, and French). Results showed that, whereas performance of French participants did not significantly vary across stress positions (likely due to language-specific rhythmic characteristics), the segmentation performance of Spanish and English listeners was unaltered when syllables in word-initial and word-final positions were salient, but it dropped to chance level when salience was on the medial syllable. We argue that pitch rise in word-medial syllables draws attentional resources away from word boundaries, thus decreasing segmentation effectiveness.
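
As an illustration of the statistical computation referred to above, the following is a minimal sketch of forward transitional probabilities, TP(b | a) = count(a b) / count(a), over a syllable stream, with a word boundary posited wherever the TP dips below a threshold. The syllables, the three trisyllabic "words", the word order, and the 0.9 threshold are all hypothetical and are not taken from the study; the threshold rule is one simple way to turn TPs into a segmentation, not a claim about how listeners do it.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Estimate forward TPs, P(b | a), from adjacent syllable pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (all but the last).
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


def segment(syllables, threshold):
    """Split the stream wherever the forward TP falls below `threshold`."""
    tps = transitional_probabilities(syllables)
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tps[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


# Hypothetical artificial language: three trisyllabic "words" (assumed, not from the study).
lexicon = {"A": ["tu", "pi", "ro"], "B": ["go", "la", "bu"], "C": ["da", "ko", "ti"]}
order = "ABCACBABCBAC"  # assumed word order with no immediate repetitions
stream = [syllable for word in order for syllable in lexicon[word]]

tps = transitional_probabilities(stream)
recovered = segment(stream, threshold=0.9)
```

In this toy stream every within-word transition has a TP of 1.0, while transitions across word boundaries are at most 2/3, so the threshold recovers the original words.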

First and second language acquisition both require that speech be segmented into familiar, multiphonemic units (e.g., words and common phrases). The present research examines one segmentation cue that is of considerable theoretical interest: the repetition of fixed sequences of speech. On each trial, subjects heard repetitions ('pre-exposures') of two artificially-constructed, multisyllabic patterns that shared an embedded segment 1 or 2 syllables long (e.g., 2 shared syllables: [ga-li-SE] and [li-SE-stu]). There were 2 and 6, 4 and 4, or 6 and 2 repetitions of the two patterns, randomly ordered. Subjects were then to indicate the groupings they perceived within a subsequent, longer sequence containing both of the pre-exposed patterns (e.g., [ga-li-SE-stu]). Responses varied systematically with the size of the embedded segment, the repetition frequencies of the two pre-exposed patterns, and the serial position of each pre-exposure. The results illustrate how investigations of the processing of speech patterns may contribute to an understanding of some elementary aspects of language learning.

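Words are the basic structural units of language, and segmenting words is an important step in language processing. Segmentation cues in the spoken speech stream come from three sources: phonetic, semantic, and syntactic. Phonetic cues include probabilistic information, phonotactic rules, and prosodic information; prosodic information in turn includes lexical stress, duration, and pitch. Individuals gradually master the use of these cues during the early stages of language exposure, and their use shows some specificity across language backgrounds. Syntactic and semantic cues are higher-level cue mechanisms that operate mainly in the later stages of word segmentation. Future research should examine word segmentation cues in spoken language processing from two perspectives: lifespan language development and language specificity.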

Before infants can learn words, they must identify those words in continuous speech. Yet, the speech signal lacks obvious boundary markers, which poses a potential problem for language acquisition (Swingley, Philos Trans R Soc Lond. Series B, Biol Sci 364 (1536), 3617–3632, 2009). By the middle of the first year, infants seem to have solved this problem (Bergelson & Swingley, Proc Natl Acad Sci 109 (9), 3253–3258, 2012; Jusczyk & Aslin, Cogn Psychol 29, 1–23, 1995), but it is unknown if segmentation abilities are present from birth, or if they only emerge after sufficient language exposure and/or brain maturation. Here, in two independent experiments, we looked at two cues known to be crucial for the segmentation of human speech: the computation of statistical co‐occurrences between syllables and the use of the language's prosody. After a brief familiarization of about 3 min with continuous speech, using functional near‐infrared spectroscopy, neonates showed differential brain responses on a recognition test to words that violated either the statistical (Experiment 1) or prosodic (Experiment 2) boundaries of the familiarization, compared to words that conformed to those boundaries. Importantly, word recognition in Experiment 2 occurred even in the absence of prosodic information at test, meaning that newborns encoded the phonological content independently of its prosody. These data indicate that humans are born with operational language processing and memory capacities and can use at least two types of cues to segment otherwise continuous speech, a key first step in language acquisition.

In two artificial language learning experiments, we investigated the impact of attention load on segmenting speech through two sublexical cues: transitional probabilities (TPs) and coarticulation. In Experiment 1, we observed that coarticulation processing was resilient to high attention load, whereas TP computation was penalized in a graded manner. In Experiment 2, we showed that encouraging participants to actively search for “word” candidates enhanced overall performance but was not sufficient to preclude the impairment of statistically driven segmentation by attention load. As long as attentional resources were depleted, independently of their intention to find these “words,” participants segmented only TP words with the highest TPs, not TP words with lower TPs. Attention load thus has a graded and differential impact on the relative weighting of the cues in speech segmentation, even when only sublexical cues are available in the signal.

The present study investigated how well individuals knowledgeable about stuttering are able to make disfluency judgments in clients who speak a language other than their own. Fourteen native speakers of Brazilian Portuguese identified and judged stuttering in Dutch speakers and in Portuguese speakers. Fourteen native speakers of Dutch identified and judged stuttering in Brazilian Portuguese speakers and in Dutch speakers. It was found that judges can make a similar level of judgment in a native and a foreign language, and that native and foreign judges can make a similar level of judgment irrespective of native/foreign differences. It was also found, however, that the Dutch judges performed significantly better in identifying native stutterers than foreign stutterers. For the identification of nonstutterers, both panels performed better in their native language than in the foreign language, and in their native language they both performed better than the other panel. Both the Brazilian Portuguese and the Dutch speaking panels were generally also less confident, and found identification of stuttering more difficult, in the foreign language than in the native language. In addition, when asked for the characteristics that helped them identify stutterers, they provided more detail in the native language than in the foreign language. A number of differences were also found between the two panels, which may be due to differences in training or cultural background. The implications of the findings for clinical practice and for future research in this area are discussed. EDUCATIONAL OBJECTIVES: The reader will be able to: (1) describe how language influences the identification of a speech disorder such as stuttering, (2) list, and (3) define behaviors that help to identify stuttering in a foreign language.

It is widely accepted that duration can be exploited as phonological phrase final lengthening in the segmentation of a novel language, i.e., in extracting discrete constituents from continuous speech. The use of final lengthening for segmentation and its facilitatory effect has been claimed to be universal. However, lengthening in the world's languages can also mark lexically stressed syllables. Stress-induced lengthening can potentially be in conflict with right edge phonological phrase boundary lengthening. Thus the processing of durational cues in segmentation can be dependent on the listener’s linguistic background, e.g., on the specific correlates and unmarked location of lexical stress in the native language of the listener. We tested this prediction and found that segmentation by both German and Basque speakers is facilitated when lengthening is aligned with the word final syllable and is not affected by lengthening on either the penultimate or the antepenultimate syllables. Lengthening of the word final syllable, however, does not help Italian and Spanish speakers to segment continuous speech, and lengthening of the antepenultimate syllable impedes their performance. We have also found a facilitatory effect of penultimate lengthening on segmentation by Italians. These results confirm our hypothesis that processing of lengthening cues is not universal, and interpretation of lengthening as a phonological phrase final boundary marker in a novel language of exposure can be overridden by the phonology of lexical stress in the native language of the listener.

Four experiments (total N = 391) examined predictions derived from a biologically based incentive salience theory of approach motivation. In all experiments, judgments indicative of enhanced perceptual salience were exaggerated in the context of positive, relative to neutral or negative, stimuli. In Experiments 1 and 2, positive words were judged to be of a larger size (Experiment 1) and led individuals to judge subsequently presented neutral objects as larger in size (Experiment 2). In Experiment 3, similar effects were observed in a mock subliminal presentation paradigm. In Experiment 4, positive word primes were perceived to have been presented for a longer duration of time, again relative to both neutral and negative word primes. Results are discussed in relation to theories of approach motivation, affective priming, and the motivation-perception interface.

A mental rotation task with unfamiliar stimuli was presented to 5-year-old (n = 36) and 8-year-old children (n = 36). The stimuli contained either an intrinsic salient axis (S+) or no salient axis (S-). Results showed that both 5-year-old and 8-year-old children were able to perform the mental rotation task with the S+ stimulus. However, 5-year-old children had difficulties in performing the mental rotation task with the S- stimulus. These results suggest that there are limitations in mental rotation abilities of young children. The ability to encode and mentally rotate unfamiliar stimuli containing no salient axis seems to improve between the ages of 5 and 8.

We investigated the effects of linguistic experience and language familiarity on the perception of audio-visual (A-V) synchrony in fluent speech. In Experiment 1, we tested a group of monolingual Spanish- and Catalan-learning 8-month-old infants with a video clip of a person speaking Spanish. Following habituation to the audiovisually synchronous video, infants saw and heard desynchronized clips of the same video where the audio stream now preceded the video stream by 366, 500, or 666 ms. In Experiment 2, monolingual Catalan and Spanish infants were tested with a video clip of a person speaking English. Results indicated that in both experiments, infants detected a 666 and a 500 ms asynchrony. That is, their responsiveness to A-V synchrony was the same regardless of their specific linguistic experience or familiarity with the tested language. Compared to previous results from infant studies with isolated audiovisual syllables, these results show that infants are more sensitive to A-V temporal relations inherent in fluent speech. Furthermore, the absence of a language familiarity effect on the detection of A-V speech asynchrony at eight months of age is consistent with the broad perceptual tuning usually observed in infant response to linguistic input at this age.

Individual variability in infants' language processing is partly explained by environmental factors, like the quantity of parental speech input, as well as by infant‐specific factors, like speech production. Here, we explore how these factors affect infant word segmentation. We used an artificial language to ensure that only statistical regularities (like transitional probabilities between syllables) could cue word boundaries, and then asked how the quantity of parental speech input and infants’ babbling repertoire predict infants’ abilities to use these statistical cues. We replicated prior reports showing that 8‐month‐old infants use statistical cues to segment words, with a preference for part‐words over words (a novelty effect). Crucially, 8‐month‐olds with larger novelty effects had received more speech input at 4 months and had greater production abilities at 8 months. These findings establish for the first time that the ability to extract statistical information from speech correlates with individual factors in infancy, like early speech experience and language production. Implications of these findings for understanding individual variability in early language acquisition are discussed.

There is strong evidence of shared acoustic profiles common to the expression of emotions in music and speech, yet relatively limited understanding of the specific psychoacoustic features involved. This study combined a controlled experiment and computational modelling to investigate the perceptual codes associated with the expression of emotion in the acoustic domain. The empirical stage of the study provided continuous human ratings of emotions perceived in excerpts of film music and natural speech samples. The computational stage created a computer model that retrieves the relevant information from the acoustic stimuli and makes predictions about the emotional expressiveness of speech and music close to the responses of human subjects. We show that a significant part of the listeners’ second-by-second reported emotions to music and speech prosody can be predicted from a set of seven psychoacoustic features: loudness, tempo/speech rate, melody/prosody contour, spectral centroid, spectral flux, sharpness, and roughness. The implications of these results are discussed in the context of cross-modal similarities in the communication of emotion in the acoustic domain.
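
To make one of the listed features concrete, here is a minimal sketch of the spectral centroid under its textbook definition: the magnitude-weighted mean frequency of a frame's spectrum. The abstract does not describe how the study extracted its features, so the function names, the naive DFT, and the test tone are assumptions chosen for illustration only; a real pipeline would normally use windowed frames and an FFT library.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitudes of DFT bins 0..N/2, computed naively in O(N^2)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(samples, sample_rate):
    """Magnitude-weighted mean frequency (Hz) of one frame of audio."""
    mags = magnitude_spectrum(samples)
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, return 0 by convention
    n = len(samples)
    return sum((k * sample_rate / n) * m for k, m in enumerate(mags)) / total


# Hypothetical test signal: a 100 Hz sine sampled at 1000 Hz for 100 samples,
# i.e. exactly ten full cycles, so all energy falls in a single DFT bin.
sample_rate = 1000
tone = [math.sin(2 * math.pi * 100 * t / sample_rate) for t in range(100)]
centroid = spectral_centroid(tone, sample_rate)
```

For a pure tone that fits the frame exactly, the centroid coincides with the tone's frequency; brighter sounds with more high-frequency energy yield higher values.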

Previous research suggests that infant speech perception reorganizes in the first year: young infants discriminate both native and non‐native phonetic contrasts, but by 10–12 months difficult non‐native contrasts are less discriminable whereas performance improves on native contrasts. In the current study, four experiments tested the hypothesis that, in addition to the influence of native language experience, acoustic salience also affects the perceptual reorganization that takes place in infancy. Using a visual habituation paradigm, two nasal place distinctions that differ in relative acoustic salience, acoustically robust labial‐alveolar [ma]–[na] and acoustically less salient alveolar‐velar [na]–[ŋa], were presented to infants in a cross‐language design. English‐learning infants at 6–8 and 10–12 months showed discrimination of the native and acoustically robust [ma]–[na] (Experiment 1), but not the non‐native (in initial position) and acoustically less salient [na]–[ŋa] (Experiment 2). Very young (4–5‐month‐old) English‐learning infants tested on the same native and non‐native contrasts also showed discrimination of only the [ma]–[na] distinction (Experiment 3). Filipino‐learning infants, whose ambient language includes the syllable‐initial alveolar (/n/)–velar (/ŋ/) contrast, showed discrimination of native [na]–[ŋa] at 10–12 months, but not at 6–8 months (Experiment 4). These results support the hypothesis that acoustic salience affects speech perception in infancy, with native language experience facilitating discrimination of an acoustically similar phonetic distinction [na]–[ŋa]. We discuss the implications of this developmental profile for a comprehensive theory of speech perception in infancy.

Two experiments investigated the importance of spatial and surface cues in the age-processing of unfamiliar faces aged between one and 80 years. Three manipulations known to affect face recognition were used, individually and in various combinations: inversion, negation, and blurring. Faces were presented either in whole or in part. Age-estimation performance was largely unaffected by most of these manipulations; age-processing appears to be a highly robust process, due to the numerous cues available. Experiment 1 showed that, in contrast to face recognition, age-perception appears to be substantially unimpaired by inversion or negation. Experiment 2 suggests that age-estimates can be made on the basis of either surface information (the 2D disposition of the internal facial features, together with texture information) or shape information (head-shape plus feature configuration, as long as shape-from-shading information is present).

This article explores young infants' ability to learn new words in situations providing tightly controlled social and salience cues to their reference. Four experiments investigated whether, given two potential referents, 15-month-olds would attach novel labels to (a) an image toward which a digital recording of a face turned and gazed, (b) a moving image versus a stationary image, (c) a moving image toward which the face gazed, and (d) a gazed-on image versus a moving image. Infants successfully used the recorded gaze cue to form new word-referent associations and also showed learning in the salience condition. However, their behavior in the salience condition and in the experiments that followed suggests that, rather than basing their judgments of the words' reference on the mere presence or absence of the referent's motion, infants were strongly biased to attend to the consistency with which potential referents moved when a word was heard.

