Similar Documents
1.
Latent semantic analysis (LSA) and transitional probability (TP), two computational methods used to reflect lexical semantic representation from large text corpora, were employed to examine the effects of word predictability on Chinese reading. Participants' eye movements were monitored, and the influence of word complexity (number of strokes), word frequency, and word predictability on different eye movement measures (first-fixation duration, gaze duration, and total time) was examined. We found influences of TP on first-fixation duration and gaze duration and of LSA on total time. The results suggest that TP reflects an early stage of lexical processing while LSA reflects a later stage.

2.
Research on semantic space has long been a focus of cognitive psychology. Because of differing views of the lexical semantic system, scientists have approached the problem from different angles with different methods. At present there are two representative approaches to studying semantic space: latent semantic analysis (LSA) and the hyperspace analogue to language (HAL). Latent semantic analysis uses singular value decomposition to uncover the latent semantic relations in a text, whereas the hyperspace analogue to language extracts semantic information using multidimensional scaling (MDS).
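As a reference point for the SVD-based approach just described, here is a minimal sketch of LSA's truncation step on a toy term-document count matrix; the counts and the choice of two latent dimensions are hypothetical, chosen only for illustration, and not taken from any system described on this page.

```python
import numpy as np

# Toy term-document count matrix: rows are terms, columns are documents.
# (Hypothetical counts, for illustration only.)
X = np.array([
    [2, 0, 1, 0],
    [1, 1, 0, 0],
    [0, 2, 1, 1],
    [0, 0, 1, 2],
], dtype=float)

def lsa_embed(X, k):
    """Project terms into a k-dimensional latent space via truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keeping only the k largest singular values/vectors is the
    # dimension-reduction step that gives LSA its "latent" structure.
    return U[:, :k] * s[:k]

terms = lsa_embed(X, k=2)
print(terms.shape)  # (4, 2): one 2-d latent vector per term
```

With k equal to the full rank the decomposition reconstructs X exactly; smaller k smooths the raw counts, which is what lets LSA surface higher-order associations between words that never co-occur directly.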

3.
Latent semantic analysis (LSA) is a statistical model of word usage that permits comparisons of semantic similarity between pieces of textual information. This paper summarizes three experiments that illustrate how LSA may be used in text-based research. Two experiments describe methods for analyzing a subject's essay for determining from what text a subject learned the information and for grading the quality of information cited in the essay. The third experiment describes using LSA to measure the coherence and comprehensibility of texts.

4.
Scoring divergent-thinking response sets has always been challenging because such responses are not only open-ended in terms of number of ideas, but each idea may also be expressed by a varying number of concepts and, thus, by a varying number of words (elaboration). While many current studies have attempted to score the semantic distance in divergent-thinking responses by applying latent semantic analysis (LSA), it is known from other areas of research that LSA-based approaches are biased according to the number of words in a response. Thus, the current article aimed to identify and demonstrate this elaboration bias in LSA-based divergent-thinking scores by means of a simulation. In addition, we show that this elaboration bias can be reduced by removing stop words (e.g., and, or, for) prior to analysis. Furthermore, the residual bias after stop word removal can be reduced by simulation-based corrections. Finally, we give an empirical illustration for alternate uses and consequences tasks. Results suggest that when both stop word removal and simulation-based bias correction are applied, convergent validity should be expected to be highest.
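The stop-word-removal step described above amounts to a simple token filter applied before any LSA scoring. This is a minimal sketch; the abbreviated stop list below is a hypothetical sample, not the list used in the study.

```python
# A few common English stop words (hypothetical, abbreviated list).
STOP_WORDS = {"a", "an", "and", "or", "for", "the", "of", "to", "in", "so"}

def remove_stop_words(response: str) -> list[str]:
    """Tokenize a free-text response and drop stop words, so that
    LSA-based scores depend less on sheer response length."""
    return [w for w in response.lower().split() if w not in STOP_WORDS]

print(remove_stop_words("A brick could be used for a doorstop and a paperweight"))
# ['brick', 'could', 'be', 'used', 'doorstop', 'paperweight']
```

Because stop words carry little semantic distance on their own but inflate word counts, filtering them out directly attacks the elaboration bias the abstract identifies.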

5.
The general aim of this study is to validate the cognitive relevance of the geometric model used in the semantic atlases (SA). With this goal in mind, we compare the results obtained by the automatic contexonym organizing model (ACOM)—an SA-derived model for word sense representation based on contextual links—with human subjects' responses on a word association task. We begin by positioning the geometric paradigm with respect to the hierarchical paradigm (WordNet) and the vector paradigm (latent semantic analysis [LSA] and the hyperspace analogue to language model). Then we compare ACOM's responses with Hirsh and Tree's (2001) word association norms based on the responses of two groups of subjects. The results showed that words associated by 50% or more of the Hirsh and Tree subjects were also proposed by ACOM (e.g., 71% of the words in the norms were also given by ACOM). Finally, we compare ACOM and LSA on the basis of the same association norms. The results indicate better performance for the geometric model.

6.
In distributional semantics models (DSMs) such as latent semantic analysis (LSA), words are represented as vectors in a high-dimensional vector space. This allows for computing word similarities as the cosine of the angle between two such vectors. In two experiments, we investigated whether LSA cosine similarities predict priming effects, in that higher cosine similarities are associated with shorter reaction times (RTs). Critically, we applied a pseudo-random procedure in generating the item material to ensure that we directly manipulated LSA cosines as an independent variable. We employed two lexical priming experiments with lexical decision tasks (LDTs). In Experiment 1 we presented participants with 200 different prime words, each paired with one unique target. We found a significant effect of cosine similarities on RTs. The same was true for Experiment 2, where we reversed the prime-target order (primes of Experiment 1 were targets in Experiment 2, and vice versa). The results of these experiments confirm that LSA cosine similarities can predict priming effects, supporting the view that they are psychologically relevant. The present study thereby provides evidence for qualifying LSA cosine similarities not only as a linguistic measure, but also as a cognitive similarity measure. However, it is also shown that other DSMs can outperform LSA as a predictor of priming effects.
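The cosine measure manipulated in these experiments is just the normalized dot product of two word vectors. A sketch with hypothetical 3-dimensional vectors follows (real LSA spaces use a few hundred dimensions, and these toy vectors stand in for no particular corpus):

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two word vectors: u.v / (|u| |v|)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical low-dimensional word vectors, for illustration only.
doctor = np.array([0.9, 0.2, 0.1])
nurse = np.array([0.8, 0.3, 0.2])
tulip = np.array([0.1, 0.9, 0.0])

# A related prime-target pair should score higher than an unrelated one.
print(cosine_similarity(doctor, nurse) > cosine_similarity(doctor, tulip))  # True
```

Because the cosine ignores vector length, it depends only on the direction of the two vectors, which is why it can serve as a pure similarity measure independent of word frequency effects on vector magnitude.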

7.
We present a longitudinal computational study on the connection between emotional and amodal word representations from a developmental perspective. In this study, children's and adult word representations were generated using the latent semantic analysis (LSA) vector space model and Word Maturity methodology. Some children's word representations were used to set a mapping function between amodal and emotional word representations with a neural network model using ratings from 9-year-old children. The neural network was trained and validated in the child semantic space. Then, the resulting neural network was tested with adult word representations using ratings from an adult data set. Samples of 1210 and 5315 words were used in the child and the adult semantic spaces, respectively. Results suggested that the emotional valence of words can be predicted from amodal vector representations even at the child stage, and accurate emotional propagation was found in the adult word vector representations. In this way, different propagative processes were observed in the adult semantic space. These findings highlight a potential mechanism for early verbal emotional anchoring. Moreover, different multiple linear regression and mixed-effect models revealed moderation effects for the performance of the longitudinal computational model. First, words with early maturation and subsequent semantic definition promoted emotional propagation. Second, an interaction effect between age of acquisition and abstractness was found to explain model performance. The theoretical and methodological implications are discussed.

8.
It is an open question whether social stereotype activation can be distinguished from nonsocial semantic activation. To address this question, gender stereotype activation (GSA) and lexical semantic activation (LSA) were directly compared. EEGs were recorded in 20 participants as they identified the congruence between prime-target word pairs under four different conditions (stereotype congruent, stereotype incongruent, semantic congruent, and semantic incongruent). We found that congruent targets elicited faster and more accurate responses and reduced N400 amplitudes irrespective of priming category types. The N400 congruency effect (i.e., the difference between incongruity and congruity) started earlier and had greater amplitude for GSA than for LSA. Moreover, gender category priming induced a smaller N400 and a larger P600 than lexical category priming. These findings suggest that the brain is not only sensitive to both stereotype and semantic violation in the post-perceptual processing stage but can also differentiate these two information processes. Further, the findings suggest superior processing (i.e., faster and deeper processing) when the words are associated with social category and convey stereotype knowledge.

9.
In this study, we compared four expert graders with latent semantic analysis (LSA) to assess short summaries of an expository text. As is well known, LSA has technical difficulties establishing a good semantic representation when analyzing short texts. In order to improve the reliability of LSA relative to human graders, we analyzed three new algorithms by two holistic methods used in previous research (León, Olmos, Escudero, Cañas, & Salmerón, 2006). The three new algorithms were (1) the semantic common network algorithm, an adaptation of an algorithm proposed by W. Kintsch (2001, 2002) with respect to LSA as a dynamic model of semantic representation; (2) a best-dimension reduction measure of the latent semantic space, selecting those dimensions that best contribute to improving the LSA assessment of summaries (Hu, Cai, Wiemer-Hastings, Graesser, & McNamara, 2007); and (3) the Euclidean distance measure, used by Rehder et al. (1998), which incorporates both vector length and the cosine measure. A total of 192 Spanish middle-grade students and 6 experts took part in this study. They read an expository text and produced a short summary. Results showed significantly higher reliability of LSA as a computerized assessment tool for expository text when it used a best-dimension algorithm rather than a standard LSA algorithm. The semantic common network algorithm also showed promising results.

10.
Latent semantic analysis (LSA) is a computational model of human knowledge representation that approximates semantic relatedness judgments. Two issues are discussed that researchers must attend to when evaluating the utility of LSA for predicting psychological phenomena. First, the role of semantic relatedness in the psychological process of interest must be understood. LSA indices of similarity should then be derived from this theoretical understanding. Second, the knowledge base (semantic space) from which similarity indices are generated must contain "knowledge" that is appropriate to the task at hand. Proposed solutions are illustrated with data from an experiment in which LSA-based indices were generated from theoretical analysis of the processes involved in understanding two conflicting accounts of a historical event. These indices predict the complexity of subsequent student reasoning about the event, as well as hand-coded predictions generated from think-aloud protocols collected when students were reading the accounts of the event.

11.
Two immediate serial recall experiments were conducted to test the associative-link hypothesis (Stuart & Hulme, 2000). We manipulated interitem association by varying the intralist latent semantic analysis (LSA) cosines in our 7-item study word lists, each of which consisted of high- or low-frequency words in Experiment 1 and high- or low-imageability words in Experiment 2. Whether item recall performance was scored by a serial-recall or free-recall criterion, we found main effects of interitem association, word imageability, and word frequency. The effect of interitem association also interacted with the word frequency effect, but not with the word imageability effect. The LSA cosine × word frequency interaction occurred in the recency, but not primacy, portion of the serial position curve. The present findings set explanatory boundaries for the associative-link hypothesis, and we argue that both item- and associative-based mechanisms are necessary to account for the word frequency effect in immediate serial recall.

12.
Latent semantic analysis (LSA) is a model of knowledge representation for words. It works by applying dimension reduction to local co-occurrence data from a large collection of documents after performing singular value decomposition on it. When the reduction is applied, the system forms condensed representations for the words that incorporate higher order associations. The higher order associations are primarily responsible for any semantic similarity between words in LSA. In this article, a memory model is described that creates semantic representations for words that are similar in form to those created by LSA. However, instead of applying dimension reduction, the model builds the representations by using a retrieval mechanism from a well-known account of episodic memory.

13.
In a previous article, we presented a systematic computational study of the extraction of semantic representations from the word-word co-occurrence statistics of large text corpora. The conclusion was that semantic vectors of pointwise mutual information values from very small co-occurrence windows, together with a cosine distance measure, consistently resulted in the best representations across a range of psychologically relevant semantic tasks. This article extends that study by investigating the use of three further factors (the application of stop-lists, word stemming, and dimensionality reduction using singular value decomposition, SVD) that have been used to provide improved performance elsewhere. It also introduces an additional semantic task and explores the advantages of using a much larger corpus. This leads to the discovery and analysis of improved SVD-based methods for generating semantic representations (that provide new state-of-the-art performance on a standard TOEFL task) and the identification and discussion of problems and misleading results that can arise without a full systematic study.
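The pointwise-mutual-information vectors the earlier study favored can be sketched from a word-word co-occurrence count matrix. The positive-PMI (PPMI) variant shown here is one common formulation, and the toy counts are hypothetical; neither is claimed to be the paper's exact pipeline.

```python
import numpy as np

def ppmi(counts: np.ndarray) -> np.ndarray:
    """Positive pointwise mutual information from a word-word
    co-occurrence count matrix: max(0, log p(x,y) / (p(x) p(y)))."""
    total = counts.sum()
    p_xy = counts / total
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(p_xy / (p_x * p_y))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts give -inf/NaN; clamp them
    return np.maximum(pmi, 0.0)

# Toy symmetric co-occurrence counts from a small window (hypothetical).
C = np.array([[0, 4, 1],
              [4, 0, 1],
              [1, 1, 0]], dtype=float)
M = ppmi(C)  # rows of M are the semantic vectors compared by cosine
```

PMI rewards pairs that co-occur more often than their individual frequencies predict, which is why small windows, where co-occurrence is tightly constrained, yield sharp semantic vectors.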

14.
Predication
In Latent Semantic Analysis (LSA) the meaning of a word is represented as a vector in a high-dimensional semantic space. Different meanings of a word or different senses of a word are not distinguished. Instead, word senses are appropriately modified as the word is used in different contexts. In N-VP sentences, the precise meaning of the verb phrase depends on the noun it is combined with. An algorithm is described to adjust the meaning of a predicate as it is applied to different arguments. In forming a sentence meaning, not all features of a predicate are combined with the features of the argument, but only those that are appropriate to the argument. Hence, a different "sense" of a predicate emerges every time it is used in a different context. This predication algorithm is explored in the context of four different semantic problems: metaphor interpretation, causal inferences, similarity judgments, and homonym disambiguation.

15.
The effectiveness of a domain-specific latent semantic analysis (LSA) in assessing reading strategies was examined. Students were given self-explanation reading training (SERT) and asked to think aloud after each sentence in a science text. Novice and expert human raters and two LSA spaces (general reading, science) rated the similarity of each think-aloud protocol to benchmarks representing three different reading strategies (minimal, local, and global). The science LSA space correlated highly with human judgments, and more highly than did the general reading space. Also, cosines from the science LSA space can distinguish between different levels of semantic similarity, but may have trouble in distinguishing local processing protocols. Thus, a domain-specific LSA space is advantageous regardless of the size of the space. The results are discussed in the context of applying the science LSA to a computer-based version of SERT that gives online feedback based on LSA cosines.

16.
We explored methods of using latent semantic analysis (LSA) to identify reading strategies in students' self-explanations that are collected as part of a Web-based reading trainer. In this study, college students self-explained scientific texts, one sentence at a time. LSA was used to measure the similarity between the self-explanations and semantic benchmarks (groups of words and sentences that together represent reading strategies). Three types of semantic benchmarks were compared: content words, exemplars, and strategies. Discriminant analyses were used to classify global and specific reading strategies using the LSA cosines. All benchmarks contributed to the classification of general reading strategies, but the exemplars did the best in distinguishing subtle semantic differences between reading strategies. Pragmatic and theoretical concerns of using LSA are discussed.

18.
The current experiment investigated how sentential form-class expectancies influenced lexical-semantic priming within each hemisphere. Sentences were presented that led readers to expect a noun or a verb and the sentence-final target word was presented to one visual field/hemisphere for a lexical decision response. Noun and verb targets in the semantically related condition were compared to an unrelated prime condition, which also predicted part of speech but did not contain any lexical-semantic associates of the target word. The semantic priming effect was strongly modulated by form-class expectancy for RVF/LH targets, for both nouns and verbs. In the LVF/RH, semantic priming was obtained in all conditions, regardless of whether the form-class expectancy was violated. However, the nouns that were preceded by a noun-predicting sentence showed an extremely high priming value in the LVF/RH, suggesting that the RH may have some sensitivity to grammatical predictions for nouns. Comparisons of LVF/RH priming to calculations derived from the LSA model of language representation, which does not utilize word order, suggested that the RH might derive message-level meaning primarily from lexical-semantic relatedness.

19.
Empirical studies indicate that analogy consists of two main processes: retrieval and mapping. While current theories and models of analogy have revealed much about the mainly structural constraints that govern the mapping process, the similarities that underpin the correspondences between individual representational elements and drive retrieval are understood in less detail. In existing models symbol similarities are externally defined but neither empirically grounded nor theoretically justified. This paper introduces a new model (EMMA: the environmental model of analogy) which relies on co-occurrence information provided by LSA (Latent Semantic Analysis; Landauer & Dumais, 1997) to ground the relations between the symbolic elements aligned in analogy. LSA calculates a contextual distribution for each word encountered in a corpus by counting the frequency with which it co-occurs with other words. This information is used to define a model that locates each word encountered in a high-dimensional space, with relations between elements in this space representing contextual similarities between words. A series of simulation experiments demonstrate that the environmental approach to semantics embodied in LSA can produce appropriate patterns of analogical retrieval, but that this semantic measure is not sufficient to model analogical mapping. The implications of these findings, both for theories of representation in analogy research and more general theories of semantics in cognition, are explored.

20.
This article presents the current state of a work in progress, whose objective is to better understand the effects of factors that significantly influence the performance of latent semantic analysis (LSA). A difficult task, which consisted of answering (French) biology multiple choice questions, was used to test the semantic properties of the truncated singular space and to study the relative influence of the main parameters. Dedicated software was designed to fine-tune the LSA semantic space for the multiple choice questions task. With optimal parameters, the performance of our simple model was, quite surprisingly, equal or superior to that of seventh- and eighth-grade students. This indicates that the semantic spaces were quite good despite their low dimensions and the small sizes of the training data sets. In addition, we present an original entropy global weighting of the answers' terms for each of the multiple choice questions, which was necessary to achieve the model's success.
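The article describes its entropy weighting as original, so it is not reproduced here. For orientation only, the standard log-entropy global weight commonly paired with LSA can be sketched as follows; this is an assumption about the general technique, not the paper's exact scheme.

```python
import math

def entropy_weight(term_counts: list[int]) -> float:
    """Standard LSA log-entropy global weight for one term:
    1 + sum_j p_j * log(p_j) / log(n_docs).
    Terms spread evenly over documents get weights near 0;
    terms concentrated in few documents get weights near 1."""
    n_docs = len(term_counts)
    total = sum(term_counts)
    entropy = 0.0
    for c in term_counts:
        if c > 0:
            p = c / total
            entropy += p * math.log(p)
    return 1.0 + entropy / math.log(n_docs)

print(entropy_weight([5, 0, 0, 0]))  # 1.0: term appears in a single document
```

The weight multiplies each term's raw frequency before the SVD, downweighting terms that occur everywhere and therefore discriminate poorly between documents or answers.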
