The idea that at least some aspects of word meaning can be induced from patterns of word co-occurrence is becoming increasingly popular. However, there is less agreement about the precise computations involved, and the appropriate tests to distinguish between the various possibilities. It is important that the effects of the relevant design choices and parameter values are understood if psychological models using these methods are to be reliably evaluated and compared. In this article, we present a systematic exploration of the principal computational possibilities for formulating and validating representations of word meanings from word co-occurrence statistics. We find that, once we have identified the best procedures, a very simple approach is surprisingly successful and robust over a range of psychologically relevant evaluation measures.

Notable progress has been made recently on computational models of semantics using vector representations for word meaning (Mikolov, Chen, Corrado, & Dean, 2013; Mikolov, Sutskever, Chen, Corrado, & Dean, 2013). As representations of meaning, recent models presumably home in on plausible organizational principles for meaning. We performed an analysis on the organization of the skip-gram model’s semantic space. Consistent with human performance (Osgood, Suci, & Tannenbaum, 1957), the skip-gram model primarily relies on affective distinctions to organize meaning. We showed that the skip-gram model accounts for unique variance in behavioral measures of lexical access above and beyond that accounted for by affective and lexical measures. We also raised the possibility that word frequency predicts behavioral measures of lexical access due to the fact that word use is organized by semantics. Deconstruction of the semantic representations in semantic models has the potential to reveal organizing principles of human semantics.

Producing high-dimensional semantic spaces from lexical co-occurrence
A procedure is presented that processes a corpus of text and produces, for each word, a numeric vector containing information about its meaning. This procedure is applied to a large corpus of natural language text taken from Usenet, and the resulting vectors are examined to determine what information is contained within them. These vectors provide the coordinates in a high-dimensional space in which word relationships can be analyzed. Analyses of both vector similarity and multidimensional scaling demonstrate that there is significant semantic information carried in the vectors. A comparison of vector similarity with human reaction times in a single-word priming experiment is presented. These vectors provide the basis for a representational model of semantic memory, hyperspace analogue to language (HAL).
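
As a rough illustration of the kind of procedure this abstract describes, the sketch below builds co-occurrence vectors from a token sequence and compares them with cosine similarity. It is not the authors' implementation: the function names, the symmetric window, the window size, and the distance-based weighting (closer neighbours count more) are all assumptions made for the example, since the abstract does not specify them.

```python
from math import sqrt


def cooccurrence_vectors(tokens, window=2):
    """Build one co-occurrence vector per word type (illustrative sketch).

    Each pair of words at distance d <= window adds a weight of
    (window - d + 1) to both words' vectors. The symmetric counting and
    this weighting scheme are assumptions of the example.
    """
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    counts = {word: [0.0] * len(vocab) for word in vocab}
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):
            j = i + d
            if j >= len(tokens):
                break
            weight = float(window - d + 1)
            neighbour = tokens[j]
            counts[word][index[neighbour]] += weight
            counts[neighbour][index[word]] += weight
    return vocab, counts


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    vocab, vectors = cooccurrence_vectors("the cat sat on the mat".split())
    print(vocab)
    print(vectors["cat"])
    print(cosine(vectors["cat"], vectors["mat"]))
```

Each row of the result is a point in a space with one dimension per vocabulary word, so word relationships can be examined through similarities or distances between rows.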

Identifying important segments in textual data is an important area of research for various applications, including topic modelling, trend detection, summarization, and event detection. In existing research, different metrics have been studied to analyse the word co-occurrence network. This work contributes non-semantic, unsupervised topic identification using word co-occurrence networks. Keyphrases are identified by preserving the lexical sequence using a directed and weighted word co-occurrence network. Further, an AHP (Analytic Hierarchy Process) model based upon four significant attributes of the word co-occurrence networks is proposed to rank the keyphrases. The most frequently occurring segment is identified as an influential segment. Experimental results indicate high effectiveness of the proposed approach. Results on the First Story Detection, 72 Twitter TDT, and synthesized Rio Olympics datasets are discussed to demonstrate its potential in precisely discovering influential segments.

Lexical co-occurrence models of semantic memory represent word meaning by vectors in a high-dimensional space. These vectors are derived from word usage, as found in a large corpus of written text. Typically, these models are fully automated, an advantage over models whose semantic representations are based on human judgments (e.g., feature-based models). A common criticism of co-occurrence models is that the representations are not grounded: Concepts exist only relative to each other in the space produced by the model. It has been claimed that feature-based models offer an advantage in this regard. In this article, we take a step toward grounding a co-occurrence model. A feed-forward neural network is trained using back propagation to provide a mapping from co-occurrence vectors to feature norms collected from subjects. We show that this network is able to retrieve the features of a concept from its co-occurrence vector with high accuracy and is able to generalize this ability to produce an appropriate list of features from the co-occurrence vector of a novel concept.

Assessments of lexical acquisition are often limited to preschool children on forced‐choice comprehension measures. This study assessed the nature of the understandings 30 school‐age children (mean age = 6;7) acquired about the science term eclipse following a naturalistic exposure to a solar eclipse. The knowledge children acquired about eclipses and a control term comet was assessed at three points in time (baseline‐test, 2‐week post‐test and 5‐month post‐test) using a range of assessment tasks (multiple‐choice comprehension, picture‐naming, drawing and a model solar system manipulation task). Children's knowledge at the baseline‐test and 2‐week post‐test was compared with that of 15 adult controls. The analysis focused on the range of knowledge children acquired about eclipses and the relationships between aspects of knowledge they acquired. We found that children acquired extensive knowledge about eclipses, but not comets. At the 2‐week post‐test, the majority of children were able to produce the term eclipse and provided evidence of accurate comprehension and wider conceptual knowledge about solar eclipses, which was retained at the 5‐month post‐test. Further, children's ability to produce the term was related to their acquisition of ‘rich’ semantic and conceptual knowledge.

The status of semantic conceptual structures in aphasia was investigated with relation to naming disorders in spontaneous and constrained speech production. A battery of six tasks was administered to 25 control subjects and 25 aphasics: spontaneous speech production (from which the percentage of nouns was calculated), confrontation naming, understanding class relationships (verbal and pictorial), and understanding thematic relationships (verbal and pictorial). Results indicated the important role of taxonomic abilities for naming, while other conceptual structures (i.e., thematic relations) do not seem to play any important role in the process of naming. These results are discussed in terms of the internal organization of semantic information. This work was supported by grants to Drs. Semenza and Bisiacchi from the Ministero della Pubblica Istruzione and by the Consiglio Nazionale delle Ricerche Unita 14, Scienza del Comportamento.

Words differ considerably in the amount of associated semantic information. Despite the crucial role of meaning in language, it is still unclear whether and how this variability modulates language learning. Here, we provide initial evidence demonstrating that implicit learning in repetition priming is influenced by the amount of semantic features associated with a given word. Electroencephalographic recordings were obtained while participants performed a visual lexical decision task; the complete stimulus set was repeated once. Repetition priming effects on performance accuracy and the N400 component of the event-related brain potential were enhanced for words with many semantic features. These findings suggest a novel and important impact of the richness of semantic representations on learning and plasticity within the lexical-conceptual system; they are discussed in their relevance for assumptions concerning basic mechanisms underlying word learning.

Human ratings of valence, arousal, and dominance are frequently used to study the cognitive mechanisms of emotional attention, word recognition, and numerous other phenomena in which emotions are hypothesized to play an important role. Collecting such norms from human raters is expensive and time consuming. As a result, affective norms are available for only a small number of English words, are not available for proper nouns in English, and are sparse in other languages. This paper investigated whether affective ratings can be predicted from length, contextual diversity, co-occurrences with words of known valence, and orthographic similarity to words of known valence, providing an algorithm for estimating affective ratings for larger and different datasets. Our bootstrapped ratings achieved correlations with human ratings on valence, arousal, and dominance that are on par with previously reported correlations across gender, age, education and language boundaries. We release these bootstrapped norms for 23,495 English words.

There is current controversy regarding the role of episodic representations in the formation of long-term semantic memories. Using the drug midazolam to induce temporary amnesia, we tested participants' memories for newly learned facts in a semantic cue condition or an episodic and semantic cue condition. Following midazolam administration, memory performance was superior in the episodic and semantic condition, suggesting early semantic learning is supported by episodic representations.

We examine production of word definitions by people with probable Alzheimer's disease (pAD). In the first experiment, healthy young adults defined concrete, imageable nouns to provide a baseline of definitional ability. Analysis of these definitions identified the key defining features of each target item. In the second experiment, pAD participants and elderly controls produced definitions of the same items. In the third experiment, healthy young participants rated the adequacies of these definitions. Although as expected the pAD participants produced fewer good definitions than the other two groups, most of their responses still contained some relevant information. pAD definitions contained fewer pieces of information and the information they produced was more tangential to the primary concept than that provided by the young or elderly participants. We identify two possible explanations in semantic loss and metalinguistic impairment. We consider metalinguistic impairment to provide the more plausible explanation of pAD patients' definitional performance.

Some alternative hypotheses about the recognition of ambiguous words are considered. According to the selective-access hypothesis, prior semantic context biases people to access one meaning of an ambiguous word rather than another in lexical memory during recognition. In contrast, the nonselective-access hypothesis states that all meanings of the word are accessed regardless of the context. We tested certain versions of these hypotheses by having students decide whether selected strings of letters were English words. The stimuli included test sequences of three words in which the second word had two distinct possible meanings, whereas the first and third words were related to these meanings in various ways. When the first and third words were related to the same meaning of the ambiguous second word (e.g., SAVE-BANK-MONEY), the reaction time to recognize the third word decreased. But when the first and third words were related to different meanings of the second word (e.g., RIVER-BANK-MONEY), the reaction time for the third word was not reliably different from a control sequence with unrelated words. These and other data favor the selective-access hypothesis. Selective access to lexical memory is discussed in relation to models of word recognition.

We explore the adequacy of two types of similarity representation in the context of semantic concepts. To this end, we evaluate different categorization models, assuming either a geometric or a featural representation, using categorization decisions involving familiar and unfamiliar foods and animals. The study aims to assess the optimal stimulus representation as a function of the familiarity of the stimuli. For the unfamiliar stimuli, the geometric categorization models provide the best account of the categorization data, whereas for the familiar stimuli, the featural categorization models provide the best account. This pattern of results suggests that people rely on perceptual information to assign an unfamiliar stimulus to a category but rely on more elaborate conceptual knowledge when assigning a familiar stimulus.

A key goal for cognitive neuroscience is to understand the neurocognitive systems that support semantic memory. Recent multivariate analyses of neuroimaging data have contributed greatly to this effort, but the rapid development of these novel approaches has made it difficult to track the diversity of findings and to understand how and why they sometimes lead to contradictory conclusions. We address this challenge by reviewing cognitive theories of semantic representation and their neural instantiation. We then consider contemporary approaches to neural decoding and assess which types of representation each can possibly detect. The analysis suggests why the results are heterogeneous and identifies crucial links between cognitive theory, data collection, and analysis that can help to better connect neuroimaging to mechanistic theories of semantic cognition.

There is a growing body of research in psychology that attempts to extrapolate human lexical judgments from computational models of semantics. This research can be used to help develop comprehensive norm sets for experimental research; it has applications to large-scale statistical modelling of lexical access and has broad value within natural language processing and sentiment analysis. However, the value of extrapolated human judgments has recently been questioned within psychological research. Of primary concern is the fact that extrapolated judgments may not share the same pattern of statistical relationship with lexical and semantic variables as do actual human judgments; often the error component in extrapolated judgments is not psychologically inert, making such judgments problematic to use for psychological research. We present a new methodology for extrapolating human judgments that partially addresses prior concerns of validity. We use this methodology to extrapolate human judgments of valence, arousal, dominance, and concreteness for 78,286 words. We also provide resources for users to extrapolate these human judgments for three million English words and short phrases. Applications for large sets of extrapolated human judgments are demonstrated and discussed.

Shlomo Bentin. Brain and Language, 1987, 31(2): 308-327
Electrophysiological activity was recorded at 16 scalp locations during a word recognition task in order to investigate the effect of expectancy factors on ERPs. In each of 160 trials two stimuli (S1 and S2) were presented with a stimulus onset asynchrony (SOA) of 1500 msec. There were four experimental conditions. In the word-antonym (W-A) and the word-nonantonym (W-NA) conditions, both S1 and S2 were words. The subjects' task was to think of the antonym to S1 and respond as fast as possible after the presentation of S2 by pressing a "YES" button if S2 was an antonym to S1 (in the W-A trials), or a "NO" button if S2 was not an antonym to S1 (in the W-NA trials). In the nonword-word (NW-W) and nonword-nonword (NW-NW) conditions S1 was a nonword, while S2 was either a word (in NW-W trials) or a nonword (in NW-NW trials). If S1 was not a word, the subjects were instructed to wait for S2, and respond as fast as possible by pressing the "YES" button if it was a word and the "NO" button if it was not a word. EEG was sampled during a time epoch that started 100 msec before the onset of S1 and continued for another 2560 msec. The ERPs were analyzed separately for each experimental condition and for time epochs related to S1, to S2, and to the SOA. Expected antonyms were recognized significantly faster than any other words or nonwords. The RTs to words in the W-NA and NW-W conditions, and to nonwords in the NW-NW condition, did not differ significantly from each other. The ERP difference between the four conditions following S2 was interpreted in terms of a negative-going potential which appeared prior to the P300, during a time period which started 200 msec and ended 550 msec from stimulus onset. The negativity related to nonwords was significantly larger than the negativity related to words. The negativity related to the expected antonym was almost nonexistent. It is speculated that this negativity has the same origin as N400, and that it might be related to the process of lexical access.

In this article, we introduce a software package that applies a corpus-based algorithm to derive semantic representations of words. The algorithm relies on analyses of contextual information extracted from a text corpus, specifically analyses of word co-occurrences in a large-scale electronic database of text. Here, a target word is represented as the combination of the average of all words preceding the target and all words following it in a text corpus. The semantic representation of the target words can be further processed by a self-organizing map (SOM; Kohonen, Self-organizing maps, 2001), an unsupervised neural network model that provides efficient data extraction and representation. Due to its topography-preserving features, the SOM projects the statistical structure of the context onto a 2-D space, such that words with similar meanings cluster together, forming groups that correspond to lexically meaningful categories. Such a representation system has its applications in a variety of contexts, including computational modeling of language acquisition and processing. In this report, we present specific examples from two languages (English and Chinese) to demonstrate how the method is applied to extract the semantic representations of words.
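
The representation described in this abstract (the average of the words preceding a target, combined with the average of the words following it) can be sketched in a few lines. This is only a toy reading of that one sentence, not the software package itself: it assumes one-hot base vectors for context words, uses only the immediately adjacent word on each side, and takes "combination" to mean concatenation. The function name `context_representation` is hypothetical.

```python
def context_representation(tokens):
    """Represent each word by [average preceding word] + [average following word].

    Assumptions of this sketch: context words are one-hot vectors, only the
    immediately adjacent word on each side is used, and the two averages
    are combined by concatenation.
    """
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    size = len(vocab)
    before = {word: [0.0] * size for word in vocab}
    after = {word: [0.0] * size for word in vocab}
    n_before = {word: 0 for word in vocab}
    n_after = {word: 0 for word in vocab}

    for i, word in enumerate(tokens):
        if i > 0:
            before[word][index[tokens[i - 1]]] += 1.0
            n_before[word] += 1
        if i < len(tokens) - 1:
            after[word][index[tokens[i + 1]]] += 1.0
            n_after[word] += 1

    representations = {}
    for word in vocab:
        # Average each context side; a side with no occurrences stays all zeros.
        avg_before = [c / n_before[word] if n_before[word] else 0.0 for c in before[word]]
        avg_after = [c / n_after[word] if n_after[word] else 0.0 for c in after[word]]
        representations[word] = avg_before + avg_after
    return vocab, representations


if __name__ == "__main__":
    vocab, reps = context_representation("a b a c".split())
    for word in vocab:
        print(word, reps[word])
```

Vectors of this form could then be passed to a self-organizing map or any other clustering method; that step is left out of the sketch.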

The present experiment investigated semantic information extraction in parafoveal word perception. An ambiguous word (bank) was presented in foveal vision, and simultaneously a disambiguating word (water, money) was presented in the parafovea. Subjects were required to make a forced choice between two phrases, and the task was constructed so that a correct choice could be made if semantic information about both the foveal and parafoveal word had been obtained. However, the results indicated that the forced-choice results could be explained by two factors: identification of the parafoveal word and correct guessing. Hence, it was concluded that those models of reading which rely on unconscious semantic preprocessing of parafoveal words were not supported.

The authors examined the processing of phonological and orthographic word representations among 17 dyslexic and 16 normal college-level readers using Event-Related Potential measures. They focused on 2 early components--the P200 and the P300. The results revealed P200 and P300 components of lower amplitude and later latency among dyslexic readers than among normal readers for both types of word representation. Group differences were greatest for phonological representations. In addition, the authors observed greater time gaps among dyslexic readers than among normal readers between different processing stages (i.e., between P2 and P3 peaks, between P3 and reaction time). Combined, the data suggest a consistent speed-of-processing deficit among dyslexic readers that is evident within and between stages of cognitive processing. The results are discussed in the context of deficits in stimulus encoding and working memory. In addition, the authors discuss the need for accurate timing and synchronization of phonological and orthographic codes for efficient word recognition.

We sought to establish whether novel words can become integrated into existing semantic networks by teaching participants new meaningful words and then using these new words as primes in two semantic priming experiments, in which participants carried out a lexical decision task to familiar words. Importantly, at no point in training did the novel words co-occur with the familiar words that served as targets in the primed lexical decision task, allowing us to evaluate semantic priming in the absence of direct association. We found that familiar words were primed by the newly related novel words, both when the novel word prime was unmasked (Experiment 1) and when it was masked (Experiment 2), suggesting that the new words had been integrated into semantic memory. Furthermore, this integration was strongest after a 1-week delay and was independent of explicit recall of the novel word meanings: Forgetting of meanings did not attenuate priming. We argue that even after brief training, newly learned words become an integrated part of the adult mental lexicon rather than being episodically represented separately from the lexicon.
