Found 20 similar articles (search time: 15 ms)
1.
Since the experiments of Saffran et al. [Saffran, J., Aslin, R., & Newport, E. (1996). Statistical learning in 8-month-old infants. Science, 274, 1926-1928], there has been a great deal of interest in the question of how statistical regularities in the speech stream might be used by infants to begin to identify individual words. In this work, we use computational modeling to explore the effects of different assumptions the learner might make regarding the nature of words - in particular, how these assumptions affect the kinds of words that are segmented from a corpus of transcribed child-directed speech. We develop several models within a Bayesian ideal observer framework, and use them to examine the consequences of assuming either that words are independent units, or units that help to predict other units. We show through empirical and theoretical results that the assumption of independence causes the learner to undersegment the corpus, with many two- and three-word sequences (e.g. what’s that, do you, in the house) misidentified as individual words. In contrast, when the learner assumes that words are predictive, the resulting segmentation is far more accurate. These results indicate that taking context into account is important for a statistical word segmentation strategy to be successful, and raise the possibility that even young infants may be able to exploit more subtle statistical patterns than have usually been considered.
2.
Çağrı Çöltekin. Cognitive Science, 2017, 41(7): 1988-2021
This study investigates a strategy based on predictability of consecutive sub‐lexical units in learning to segment a continuous speech stream into lexical units using computational modeling and simulations. Lexical segmentation is one of the early challenges during language acquisition, and it has been studied extensively through psycholinguistic experiments as well as computational methods. However, despite strong empirical evidence, the explicit use of predictability of basic sub‐lexical units in models of segmentation is underexplored. This paper presents an incremental computational model of lexical segmentation for exploring the usefulness of predictability for lexical segmentation. We show that the predictability cue is a strong cue for segmentation. Contrary to earlier reports in the literature, the strategy yields state‐of‐the‐art segmentation performance with an incremental computational model that uses only this particular cue in a cognitively plausible setting. The paper also reports an in‐depth analysis of the model, investigating the conditions affecting the usefulness of the strategy.
3.
Cross‐situational word learning, like any statistical learning problem, involves tracking the regularities in the environment. However, the information that learners pick up from these regularities is dependent on their learning mechanism. This article investigates the role of one type of mechanism in statistical word learning: competition. Competitive mechanisms would allow learners to find the signal in noisy input and would help to explain the speed with which learners succeed in statistical learning tasks. Because cross‐situational word learning provides information at multiple scales—both within and across trials/situations—learners could implement competition at either or both of these scales. A series of four experiments demonstrate that cross‐situational learning involves competition at both levels of scale, and that these mechanisms interact to support rapid learning. The impact of both of these mechanisms is considered from the perspective of a process‐level understanding of cross‐situational learning.
4.
Many studies have shown that listeners can segment words from running speech based on conditional probabilities of syllable transitions, suggesting that this statistical learning could be a foundational component of language learning. However, few studies have shown a direct link between statistical segmentation and word learning. We examined this possible link in adults by following a statistical segmentation exposure phase with an artificial lexicon learning phase. Participants were able to learn all novel object-label pairings, but pairings were learned faster when labels contained high probability (word-like) or non-occurring syllable transitions from the statistical segmentation phase than when they contained low probability (boundary-straddling) syllable transitions. This suggests that, for adults, labels inconsistent with expectations based on statistical learning are harder to learn than consistent or neutral labels. In contrast, a previous study found that infants learn consistent labels, but not inconsistent or neutral labels.
5.
This paper reconsiders the diphone-based word segmentation model of Cairns, Shillcock, Chater, and Levy (1997) and Hockema (2006), previously thought to be unlearnable. A statistically principled learning model is developed using Bayes' theorem and reasonable assumptions about infants' implicit knowledge. The ability to recover phrase-medial word boundaries is tested using phonetic corpora derived from spontaneous interactions with children and adults. The (unsupervised and semi-supervised) learning models are shown to exhibit several crucial properties. First, only a small amount of language exposure is required to achieve the model's ceiling performance, equivalent to between 1 day and 1 month of caregiver input. Second, the models are robust to variation, both in the free parameter and the input representation. Finally, both the learning and baseline models exhibit undersegmentation, argued to have significant ramifications for speech processing as a whole.
6.
Children acquiring languages with noun classes (grammatical gender) have ample statistical information available that characterizes the distribution of nouns into these classes, but their use of this information to classify novel nouns differs from the predictions made by an optimal Bayesian classifier. We use rational analysis to investigate the hypothesis that children are classifying nouns optimally with respect to a distribution that does not match the surface distribution of statistical features in their input. We propose three ways in which children's apparent statistical insensitivity might arise, and find that all three provide ways to account for the difference between children's behavior and the optimal classifier. A fourth model combines two of these proposals and finds that children's insensitivity is best modeled as a bias to ignore certain features during classification, rather than an inability to encode those features during learning. These results provide insight into children's developing knowledge of noun classes and highlight the complex ways in which statistical information from the input interacts with children's learning processes.
7.
Children show a remarkable degree of consistency in learning some words earlier than others. What patterns of word usage predict variations among words in age of acquisition? We use distributional analysis of a naturalistic corpus of child-directed speech to create quantitative features representing natural variability in word contexts. We evaluate two sets of features: One set is generated from the distribution of words into frames defined by the two adjacent words. These features primarily encode syntactic aspects of word usage. The other set is generated from non-adjacent co-occurrences between words. These features encode complementary thematic aspects of word usage. Regression models using these distributional features to predict age of acquisition of 656 early-acquired English words indicate that both types of features improve predictions over simpler models based on frequency and appearance in salient or simple utterance contexts. Syntactic features were stronger predictors of children's production than comprehension, whereas thematic features were stronger predictors of comprehension. Overall, earlier acquisition was predicted by features representing frames that select for nouns and verbs, and by thematic content related to food and face-to-face play topics; later acquisition was predicted by features representing frames that select for pronouns and question words, and by content related to narratives and object play.
8.
Balancing Effort and Information Transmission During Language Acquisition: Evidence From Word Order and Case Marking
Across languages of the world, some grammatical patterns have been argued to be more common than expected by chance. These are sometimes referred to as (statistical) language universals. One such universal is the correlation between constituent order freedom and the presence of a case system in a language. Here, we explore whether this correlation can be explained by a bias to balance production effort and informativity of cues to grammatical function. Two groups of learners were presented with miniature artificial languages containing optional case marking and either flexible or fixed constituent order. Learners of the flexible order language used case marking significantly more often. This result parallels the typological correlation between constituent order flexibility and the presence of case marking in a language and provides a possible explanation for the historical development of Old English to Modern English, from flexible constituent order with case marking to relatively fixed order without case marking. In addition, learners of the flexible order language conditioned case marking on constituent order, using more case marking with the cross‐linguistically less frequent order, again mirroring typological data. These results suggest that some cross‐linguistic generalizations originate in functionally motivated biases operating during language learning.
9.
Visual Similarity of Words Alone Can Modulate Hemispheric Lateralization in Visual Word Recognition: Evidence From Modeling Chinese Character Recognition
In Chinese orthography, the most common character structure consists of a semantic radical on the left and a phonetic radical on the right (SP characters); the minority, opposite arrangement also exists (PS characters). Recent studies showed that SP character processing is more left hemisphere (LH) lateralized than PS character processing. Nevertheless, it remains unclear whether this is due to phonetic radical position or character type frequency. Through computational modeling with artificial lexicons, in which we implement a theory of hemispheric asymmetry in perception but do not assume that phonological processing is LH lateralized, we show that the difference in character type frequency alone is sufficient to exhibit the effect that the dominant type has a stronger LH lateralization than the minority type. This effect is due to higher visual similarity among characters in the dominant type than the minority type, demonstrating the modulation of visual similarity of words on hemispheric lateralization.
10.
Prior research has shown that people can learn many nouns (i.e., word–object mappings) from a short series of ambiguous situations containing multiple words and objects. For successful cross‐situational learning, people must approximately track which words and referents co‐occur most frequently. This study investigates the effects of allowing some word‐referent pairs to appear more frequently than others, as is true in real‐world learning environments. Surprisingly, high‐frequency pairs are not always learned better, but can also boost learning of other pairs. Using a recent associative model (Kachergis, Yu, & Shiffrin, 2012), we explain how mixing pairs of different frequencies can bootstrap late learning of the low‐frequency pairs based on early learning of higher frequency pairs. We also manipulate contextual diversity, the number of pairs a given pair appears with across training, since it is naturalistically confounded with frequency. The associative model has competing familiarity and uncertainty biases, and their interaction is able to capture the individual and combined effects of frequency and contextual diversity on human learning. Two other recent word‐learning models do not account for the behavioral findings.
11.
Learning General Phonological Rules From Distributional Information: A Computational Model
Phonological rules create alternations in the phonetic realizations of related words. These rules must be learned by infants in order to identify the phonological inventory, the morphological structure, and the lexicon of a language. Recent work proposes a computational model for the learning of one kind of phonological alternation, allophony (Peperkamp, Le Calvez, Nadal, & Dupoux, 2006). This paper extends the model to account for learning of a broader set of phonological alternations and the formalization of these alternations as general rules. In Experiment 1, we apply the original model to new data in Dutch and demonstrate its limitations in learning nonallophonic rules. In Experiment 2, we extend the model to allow it to learn general rules for alternations that apply to a class of segments. In Experiment 3, the model is further extended to allow for generalization by context; we argue that this generalization must be constrained by linguistic principles.
12.
Recent laboratory experiments have shown that both infant and adult learners can acquire word-referent mappings using cross-situational statistics. The vast majority of the work on this topic has used unfamiliar objects presented on neutral backgrounds as the visual contexts for word learning. However, these laboratory contexts are quite different from the real-world contexts in which learning occurs. Thus, the feasibility of generalizing cross-situational learning beyond the laboratory is in question. Adapting the Human Simulation Paradigm, we conducted a series of experiments examining cross-situational learning from children's egocentric videos captured during naturalistic play. Focusing on individually ambiguous naming moments that naturally occur during toy play, we asked how statistical learning unfolds in real time through accumulating cross-situational statistics in naturalistic contexts. We found that even when learning situations were individually ambiguous, learners’ performance gradually improved over time. This improvement was driven in part by learners’ use of partial knowledge acquired from previous learning situations, even when they had not yet discovered correct word-object mappings. These results suggest that word learning is a continuous process by means of real-time information integration.
13.
We examined electrophysiological correlates of conscious change detection versus change blindness for equivalent displays. Observers had to detect any changes, across a visual interruption, between a pair of successive displays. Each display comprised grey circles on a background of alternate black and white stripes. Foreground changes arose when light-grey circles turned dark-grey and vice-versa. Physically stronger background changes arose when all black stripes turned white and vice-versa. Despite their physical strength, background changes were undetected unless attention was directed to them, whereas foreground changes were invariably seen. Event-related potentials revealed that the P300 component was suppressed for unseen background changes, as compared with the same changes when seen. This effect arose first over frontal sites, and then spread to parietal sites. These results extend recent fMRI findings that fronto-parietal activation is associated with conscious visual change detection, to reveal the timing of these neural correlates.
14.
15.
While many constraints on learning must be relatively experience-independent, past experience provides a rich source of guidance for subsequent learning. Discovering structure in some domain can inform a learner’s future hypotheses about that domain. If a general property accounts for particular sub-patterns, a rational learner should not stipulate separate explanations for each detail without additional evidence, as the general structure has “explained away” the original evidence. In a grammar-learning experiment using tone sequences, manipulating learners’ prior exposure to a tone environment affects their sensitivity to the grammar-defining feature, in this case consecutive repeated tones. Grammar-learning performance is worse if context melodies are “smooth” — when small intervals occur more than large ones — as Smoothness is a general property accounting for a high rate of repetition. We present an idealized Bayesian model as a “best case” benchmark for learning repetition grammars. When context melodies are Smooth, the model places greater weight on the small-interval constraint, and does not learn the repetition rule as well as when context melodies are not Smooth, paralleling the human learners. These findings support an account of abstract grammar-induction in which learners rationally assess the statistical evidence for underlying structure based on a generative model of the environment.
16.
Statistical learning refers to the ability to identify structure in the input based on its statistical properties. For many linguistic structures, the relevant statistical features are distributional: They are related to the frequency and variability of exemplars in the input. These distributional regularities have been suggested to play a role in many different aspects of language learning, including phonetic categories, using phonemic distinctions in word learning, and discovering non‐adjacent relations. On the surface, these different aspects share few commonalities. Despite this, we demonstrate that the same computational framework can account for learning in all of these tasks. These results support two conclusions. The first is that much, and perhaps all, of distributional statistical learning can be explained by the same underlying set of processes. The second is that some aspects of language can be learned due to domain‐general characteristics of memory.
17.
Jessica F. Hay, Bruna Pelucchi, Katharine Graf Estes, Jenny R. Saffran. Cognitive Psychology, 2011, (2): 93-106
The processes of infant word segmentation and infant word learning have largely been studied separately. However, the ease with which potential word forms are segmented from fluent speech seems likely to influence subsequent mappings between words and their referents. To explore this process, we tested the link between the statistical coherence of sequences presented in fluent speech and infants’ subsequent use of those sequences as labels for novel objects. Notably, the materials were drawn from a natural language unfamiliar to the infants (Italian). The results of three experiments suggest that there is a close relationship between the statistics of the speech stream and subsequent mapping of labels to referents. Mapping was facilitated when the labels contained high transitional probabilities in the forward and/or backward direction (Experiment 1). When no transitional probability information was available (Experiment 2), or when the internal transitional probabilities of the labels were low in both directions (Experiment 3), infants failed to link the labels to their referents. Word learning appears to be strongly influenced by infants’ prior experience with the distribution of sounds that make up words in natural languages.
18.
19.
Florencia Reali. Cognition, 2009, 111(3): 317-328
The regularization of linguistic structures by learners has played a key role in arguments for strong innate constraints on language acquisition, and has important implications for language evolution. However, relating the inductive biases of learners to regularization behavior in laboratory tasks can be challenging without a formal model. In this paper we explore how regular linguistic structures can emerge from language evolution by iterated learning, in which one person’s linguistic output is used to generate the linguistic input provided to the next person. We use a model of iterated learning with Bayesian agents to show that this process can result in regularization when learners have the appropriate inductive biases. We then present three experiments demonstrating that simulating the process of language evolution in the laboratory can reveal biases towards regularization that might not otherwise be obvious, allowing weak biases to have strong effects. The results of these experiments suggest that people tend to regularize inconsistent word-meaning mappings, and that even a weak bias towards regularization can allow regular languages to be produced via language evolution by iterated learning.
20.
Natural languages contain many layers of sequential structure, from the distribution of phonemes within words to the distribution of phrases within utterances. However, most research modeling language acquisition using artificial languages has focused on only one type of distributional structure at a time. In two experiments, we investigated adult learning of an artificial language that contains dependencies between both adjacent and non‐adjacent words. We found that learners rapidly acquired both types of regularities and that the strength of the adjacent statistics influenced learning of both adjacent and non‐adjacent dependencies. Additionally, though accuracy was similar for both types of structure, participants’ knowledge of the deterministic non‐adjacent dependencies was more explicit than their knowledge of the probabilistic adjacent dependencies. The results are discussed in the context of current theories of statistical learning and language acquisition.