Similar literature
 Found 20 similar documents (search time: 62 ms)
1.
In this article, we present a new lexical database for French: Lexique. In addition to classical word information such as gender, number, and grammatical category, Lexique includes a series of interesting new characteristics. First, word frequencies are based on two cues: a contemporary corpus of texts and the number of Web pages containing the word. Second, the database is split into a graphemic table with all the relevant frequencies, a table structured around lemmas (particularly interesting for the study of the inflectional family), and a table about surface frequency cues. Third, Lexique is distributed under a GNU-like license, allowing people to contribute to it. Finally, a metasearch engine, Open Lexique, has been developed so that new databases can be added very easily to the existing ones. Lexique can either be downloaded or interrogated freely from http://www.lexique.org.

2.
Word frequency is the most important variable in research on word processing and memory. Yet, the main criterion for selecting word frequency norms has been the availability of the measure, rather than its quality. As a result, much research is still based on the old Kučera and Francis frequency norms. By using the lexical decision times of recently published megastudies, we show how bad this measure is and what must be done to improve it. In particular, we investigated the size of the corpus, the language register on which the corpus is based, and the definition of the frequency measure. We observed that corpus size is of practical importance for small sizes (depending on the frequency of the word), but not for sizes above 16–30 million words. As for the language register, we found that frequencies based on television and film subtitles are better than frequencies based on written sources, certainly for the monosyllabic and bisyllabic words used in psycholinguistic research. Finally, we found that lemma frequencies are not superior to word form frequencies in English and that a measure of contextual diversity is better than a measure based on raw frequency of occurrence. Part of the superiority of the latter is due to the words that are frequently used as names. Assembling a new frequency norm on the basis of these considerations turned out to predict word processing times much better than did the existing norms (including Kučera & Francis and Celex). The new SUBTL frequency norms from the SUBTLEX-US corpus are freely available for research purposes from http://brm.psychonomic-journals.org/content/supplemental, as well as from the University of Ghent and Lexique Web sites.
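
The contrast between raw frequency of occurrence and contextual diversity can be illustrated with a short Python sketch. It assumes that contextual diversity is operationalized as the number of documents (for example, films) in which a word occurs at least once; the function name and the toy corpus are hypothetical and are not taken from the SUBTLEX materials.

```python
from collections import Counter


def frequency_and_diversity(documents):
    """Return (raw_counts, contextual_diversity) for tokenized documents.

    raw_counts[w]  = total number of occurrences of w across all documents
    diversity[w]   = number of documents in which w occurs at least once
    (assumed operationalization of contextual diversity)
    """
    raw_counts = Counter()
    diversity = Counter()
    for tokens in documents:
        raw_counts.update(tokens)       # every occurrence counts
        diversity.update(set(tokens))   # each document counts at most once
    return raw_counts, diversity


# Hypothetical toy corpus: three "documents", already tokenized.
docs = [
    "the dog saw the cat".split(),
    "the cat sat".split(),
    "dog dog dog".split(),
]
raw, cd = frequency_and_diversity(docs)
print("dog:", raw["dog"], cd["dog"])
print("cat:", raw["cat"], cd["cat"])
```

In this toy corpus "dog" has a higher raw count than "cat" but the same contextual diversity, which is the kind of dissociation the two measures are meant to capture.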

3.
We present a new database of Dutch word frequencies based on film and television subtitles, and we validate it with a lexical decision study involving 14,000 monosyllabic and disyllabic Dutch words. The new SUBTLEX frequencies explain up to 10% more variance in accuracies and reaction times (RTs) of the lexical decision task than the existing CELEX word frequency norms, which are based largely on edited texts. As is the case for English, an accessibility measure based on contextual diversity explains more of the variance in accuracy and RT than does the raw frequency of occurrence counts. The database is freely available for research purposes and may be downloaded from the authors’ university site at http://crr.ugent.be/subtlex-nl or from http://brm.psychonomic-journals.org/content/supplemental.

4.
In this article, we present Procura-PALavras (P-PAL), a Web-based interface for a new European Portuguese (EP) lexical database. Based on a contemporary printed corpus of over 227 million words, P-PAL provides a broad range of word attributes and statistics, including several measures of word frequency (e.g., raw counts, per-million word frequency, logarithmic Zipf scale), morpho-syntactic information (e.g., parts of speech [PoSs], grammatical gender and number, dominant PoS, and frequency and relative frequency of the dominant PoS), as well as several lexical and sublexical orthographic (e.g., number of letters; consonant–vowel orthographic structure; density and frequency of orthographic neighbors; orthographic Levenshtein distance; orthographic uniqueness point; orthographic syllabification; and trigram, bigram, and letter type and token frequencies), and phonological measures (e.g., pronunciation, number of phonemes, stress, density and frequency of phonological neighbors, transposed and phonographic neighbors, syllabification, and biphone and phone type and token frequencies) for ~53,000 lemmatized and ~208,000 nonlemmatized EP word forms. To obtain these metrics, researchers can choose between two word queries in the application: (i) analyze words previously selected for specific attributes and/or lexical and sublexical characteristics, or (ii) generate word lists that meet word requirements defined by the user in the menu of analyses. For the measures it provides and the flexibility it allows, P-PAL will be a key resource to support research in all cognitive areas that use EP verbal stimuli. P-PAL is freely available at http://p-pal.di.uminho.pt/tools.
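
The relation between raw counts, per-million frequency, and the Zipf scale can be sketched in a few lines of Python. This is an illustration only: it assumes the commonly used definition of the Zipf scale as log10 of the frequency per million words plus 3 (the abstract does not state the formula P-PAL uses), and the corpus size of exactly 227 million words and the counts below are made-up values.

```python
import math


def per_million(count, corpus_size):
    """Frequency per million words for a raw count in a corpus of corpus_size tokens."""
    return count * 1_000_000 / corpus_size


def zipf(count, corpus_size):
    """Zipf value, ASSUMING the definition log10(frequency per million) + 3."""
    return math.log10(per_million(count, corpus_size)) + 3


# Hypothetical corpus size and counts, chosen only to make the arithmetic easy.
CORPUS_SIZE = 227_000_000
for count in (227, 22_700, 227_000):
    print(count, per_million(count, CORPUS_SIZE), round(zipf(count, CORPUS_SIZE), 2))
```

Under this assumed definition, a word occurring once per million words has a Zipf value of 3, and each tenfold increase in frequency adds one unit.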

5.
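Jing Wei, Fang Junming, Zhao Wei 《Acta Psychologica Sinica》2014,46(3):385-395
This study used eye tracking to examine, under three experimental conditions (baseline, consistent, and conflicting), the relative roles of perceptual cues and social cues in word learning by children with autism spectrum disorder. The behavioral data showed that in the conflicting condition these children chose the boring object as the referent of the novel word, indicating that social cues take precedence over perceptual cues. In the baseline and consistent conditions they chose the interesting object as the referent of the novel word, and word learning performance was better in the consistent condition than in the baseline condition, indicating that social cues have a facilitating effect relative to perceptual cues. The eye movement data showed that these children differed from typically developing children in face-looking patterns and gaze-following behavior. This suggests that although social cues play the same relative role in word learning for these children as for typically developing children, the way they obtain social information differs from that of typically developing children.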

6.
In this article, we present a new lexical database for Modern Standard Arabic: Aralex. Based on a contemporary text corpus of 40 million words, Aralex provides information about (1) the token frequencies of roots and word patterns, (2) the type frequency, or family size, of roots and word patterns, and (3) the frequency of bigrams, trigrams in orthographic forms, roots, and word patterns. Aralex will be a useful tool for studying the cognitive processing of Arabic through the selection of stimuli on the basis of precise frequency counts. Researchers can use it as a source of information on natural language processing, and it may serve an educational purpose by providing basic vocabulary lists. Aralex is distributed under a GNU-like license, allowing people to interrogate it freely online or to download it from www.mrc-cbu.cam.ac.uk:8081/aralex.online/login.jsp.

7.
This article presents MANULEX, a Web-accessible database that provides grade-level word frequency lists of nonlemmatized and lemmatized words (48,886 and 23,812 entries, respectively) computed from the 1.9 million words taken from 54 French elementary school readers. Word frequencies are provided for four levels: first grade (G1), second grade (G2), third to fifth grades (G3-5), and all grades (G1-5). The frequencies were computed following the methods described by Carroll, Davies, and Richman (1971) and Zeno, Ivenz, Millard, and Duvvuri (1995), with four statistics at each level (F, overall word frequency; D, index of dispersion across the selected readers; U, estimated frequency per million words; and SFI, standard frequency index). The database also provides the number of letters in the word and syntactic category information. MANULEX is intended to be a useful tool for studying language development through the selection of stimuli based on precise frequency norms. Researchers in artificial intelligence can also use it as a source of information on natural language processing to simulate written language acquisition in children. Finally, it may serve an educational purpose by providing basic vocabulary lists.
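
The link between two of the statistics named above, U and SFI, can be sketched as follows. The snippet assumes the formula usually given for the standard frequency index, SFI = 10 × (log10(U) + 4); the abstract itself does not spell out the formula, and the computation of U from F and D is more involved and is not attempted here. The function name and the sample values are hypothetical.

```python
import math


def standard_frequency_index(u_per_million):
    """SFI from U (estimated frequency per million words).

    ASSUMPTION: SFI = 10 * (log10(U) + 4), the formula usually attributed to
    Carroll, Davies, and Richman (1971); it is not stated in the abstract.
    """
    return 10 * (math.log10(u_per_million) + 4)


# Hypothetical U values spanning rare to very frequent words.
for u in (0.01, 1.0, 1000.0):
    print(u, round(standard_frequency_index(u), 1))
```

Under this assumed formula, a word estimated to occur once per million words has an SFI of 40, and every tenfold change in U shifts the SFI by 10 points.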

8.
This article presents MANULEX, a Web-accessible database that provides grade-level word frequency lists of nonlemmatized and lemmatized words (48,886 and 23,812 entries, respectively) computed from the 1.9 million words taken from 54 French elementary school readers. Word frequencies are provided for four levels: first grade (G1), second grade (G2), third to fifth grades (G3-5), and all grades (G1-5). The frequencies were computed following the methods described by Carroll, Davies, and Richman (1971) and Zeno, Ivenz, Millard, and Duvvuri (1995), with four statistics at each level (F, overall word frequency; D, index of dispersion across the selected readers; U, estimated frequency per million words; and SFI, standard frequency index). The database also provides the number of letters in the word and syntactic category information. MANULEX is intended to be a useful tool for studying language development through the selection of stimuli based on precise frequency norms. Researchers in artificial intelligence can also use it as a source of information on natural language processing to simulate written language acquisition in children. Finally, it may serve an educational purpose by providing basic vocabulary lists.

9.
WordGen is an easy-to-use program that uses the CELEX and Lexique lexical databases for word selection and nonword generation in Dutch, English, German, and French. Items can be generated in these four languages, specifying any combination of seven linguistic constraints: number of letters, neighborhood size, frequency, summated position-nonspecific bigram frequency, minimum position-nonspecific bigram frequency, position-specific frequency of the initial and final bigram, and orthographic relatedness. The program also has a module to calculate the respective values of these variables for items that have already been constructed, either with the program or taken from earlier studies. Stimulus queries can be entered through WordGen's graphical user interface or by means of batch files. WordGen is especially useful for (1) Dutch and German item generation, because no such stimulus-selection tool exists for these languages, (2) the generation of nonwords for all four languages, because our program has some important advantages over previous nonword generation approaches, and (3) psycholinguistic experiments on bilingualism, because the possibility of using the same tool for different languages increases the cross-linguistic comparability of the generated item lists. WordGen is free and available at http://expsy.ugent.be/wordgen.htm.
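
Two of the constraints listed above, neighborhood size and summated position-nonspecific bigram frequency, can be illustrated with a small Python sketch. This is not WordGen's implementation. It assumes that a neighbor is a same-length word differing in exactly one letter, and that bigram frequencies are counted over word types in a toy lexicon (a real tool may instead weight by token frequency from CELEX or Lexique); all names and data are hypothetical.

```python
from collections import Counter


def neighborhood_size(item, lexicon):
    """Number of lexicon words of the same length that differ from item in exactly one letter."""
    return sum(
        1
        for word in lexicon
        if len(word) == len(item)
        and sum(a != b for a, b in zip(word, item)) == 1
    )


def bigrams(word):
    """All adjacent letter pairs of word, in order."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def summed_bigram_frequency(item, lexicon):
    """Sum, over the item's bigrams, of how often each bigram occurs anywhere in the lexicon.

    ASSUMPTION: type-based counts, ignoring position within the word.
    """
    table = Counter(bg for word in lexicon for bg in bigrams(word))
    return sum(table[bg] for bg in bigrams(item))


# Hypothetical toy lexicon standing in for a real lexical database.
toy_lexicon = ["cat", "cot", "cut", "hat", "dog", "cart"]
print(neighborhood_size("cat", toy_lexicon))        # neighbors of a word
print(neighborhood_size("bat", toy_lexicon))        # neighbors of a nonword
print(summed_bigram_frequency("cat", toy_lexicon))
```

The same functions work for words and nonwords alike, which mirrors the abstract's point that values can be computed for items that have already been constructed.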

10.
This paper analyses some aspects of the eye movement behaviour of readers of Thai and Chinese. The main focus is on readers' landing site distributions on words and how these are affected by the lack of clear word boundary information due to the absence of inter-word spaces. Empirical evidence from Thai and Chinese readers suggests that readers can relatively accurately target word centres. We make the case that this accuracy can be accounted for by a default targeting model (effectively, the prior landing site distribution, in Bayesian terms) modulated by statistical cues about word beginnings available from word-initial character frequencies.

11.
An important source of information about a new word's meaning (and its associated lexical class) is its range of reference: the number of objects to which it is extended. Ninety toddlers (mean age = 37 months) participated in a study to determine whether young children can use this information in word learning. When a novel word was presented with unambiguous lexical class cues as either a proper name (i.e. 'His name is DAXY') or an adjective (i.e. 'He is very DAXY'), toddlers interpreted it appropriately, regardless of whether it was applied to one or both members of a pair of identical-looking stuffed animals. They restricted a proper name to the designated animal(s); but they generalized an adjective from the labeled animal(s) to a new animal bearing the same property. However, when the word was presented with no specific lexical class cues (i.e. 'DAXY'), toddlers made significantly different interpretations, depending on the number of referents. When the word was applied to one animal, they restricted it to that animal (consistent with a proper name interpretation); when the word was applied to two animals, they generalized it to a new animal with the property (consistent with an adjective or a restricted count noun interpretation). Range-of-reference information thus provided toddlers with a default cue to the meaning (and associated lexical class) of a novel word.

12.
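Yu Wenbo, Liang Dandan 《Advances in Psychological Science》2018,26(10):1765-1774
The word is the basic structural unit of language, and segmenting words is an important step in language processing. Segmentation cues in the spoken speech stream come from three sources: phonology, semantics, and syntax. Phonological cues include probabilistic information, phonotactic rules, and prosodic information; prosodic information in turn includes word stress, duration, and pitch. Individuals gradually master the use of these cues in the early stages of language exposure, and the cues show some language specificity across language backgrounds. Syntactic and semantic cues are higher-level cue mechanisms that operate mainly in the later stages of word segmentation. Future research should examine word segmentation cues in spoken language processing from two perspectives: lifespan language development and language specificity.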

13.
14.
WordGen is an easy-to-use program that uses the CELEX and Lexique lexical databases for word selection and nonword generation in Dutch, English, German, and French. Items can be generated in these four languages, specifying any combination of seven linguistic constraints: number of letters, neighborhood size, frequency, summated position-nonspecific bigram frequency, minimum position-nonspecific bigram frequency, position-specific frequency of the initial and final bigram, and orthographic relatedness. The program also has a module to calculate the respective values of these variables for items that have already been constructed, either with the program or taken from earlier studies. Stimulus queries can be entered through WordGen’s graphical user interface or by means of batch files. WordGen is especially useful for (1) Dutch and German item generation, because no such stimulus-selection tool exists for these languages, (2) the generation of nonwords for all four languages, because our program has some important advantages over previous nonword generation approaches, and (3) psycholinguistic experiments on bilingualism, because the possibility of using the same tool for different languages increases the cross-linguistic comparability of the generated item lists. WordGen is free and available at http://expsy.ugent.be/wordgen.htm.

15.
A comparison of two techniques for reducing context-dependent forgetting   Total citations: 2 (self-citations: 0; citations by others: 2)
Recall is poorer when tested in a new environment than when tested in the original learning context. Two techniques for reducing this context-dependent forgetting were compared. One technique involved instructing subjects to recall their learning room(s), and the other attempted to establish multiple environmental retrieval cues by presenting lists in multiple rooms rather than all in the same room. Subjects were given three word lists to study in one or three rooms. All subjects were given a free-recall test in a new room, and half were asked to use remembered environmental context (EC) information to facilitate word memory. Multiple input contexts benefited only subjects who were uninstructed in the use of EC cues. Subjects given EC-recall instructions, however, recalled somewhat less in the three-room condition than in the one-room condition. The facilitative effects of the two techniques were not additive: EC-recall instructions benefited only one-room subjects. The results suggest that both EC-recall instructions and multiple learning contexts induce subjects to use contextual retrieval cues that are otherwise not spontaneously utilized, and that the greater the number of context cues stored in memory, the less accessible those cues become.

16.
Preexisting word knowledge is accessed in many cognitive tasks, and this article offers a means for indexing this knowledge so that it can be manipulated or controlled. We offer free association data for 72,000 word pairs, along with over a million entries of related data, such as forward and backward strength, number of competing associates, and printed frequency. A separate file contains the 5,019 normed words, their statistics, and thousands of independently normed rhyme, stem, and fragment cues. Other files provide n x n associative networks for more than 4,000 words and a list of idiosyncratic responses for each normed word. The database will be useful for investigators interested in cuing, priming, recognition, network theory, linguistics, and implicit testing applications. They also will be useful for evaluating the predictive value of free association probabilities as compared with other measures, such as similarity ratings and co-occurrence norms. Of several procedures for measuring preexisting strength between two words, the best remains to be determined. The norms may be downloaded from www.psychonomic.org/archive/.

17.
English‐learning 7.5‐month‐olds are heavily biased to perceive stressed syllables as word onsets. By 11 months, however, infants begin segmenting non‐initially stressed words from speech. Using the same artificial language methodology as Johnson and Jusczyk (2001), we explored the possibility that the emergence of this ability is linked to a decreased reliance on prosodic cues to word boundaries accompanied by an increased reliance on syllable distribution cues. In a baseline study, where only statistical cues to word boundaries were present, infants exhibited a familiarity preference for statistical words. When conflicting stress cues were added to the speech stream, infants exhibited a familiarity preference for stress as opposed to statistical words. This was interpreted as evidence that 11‐month‐olds weight stress cues to word boundaries more heavily than statistical cues. Experiment 2 further investigated these results with a language containing convergent cues to word boundaries. The results of Experiment 2 were not conclusive. A third experiment using new stimuli and a different experimental design supported the conclusion that 11‐month‐olds rely more heavily on prosodic than statistical cues to word boundaries. We conclude that the emergence of the ability to segment non‐initially stressed words from speech is not likely to be tied to an increased reliance on syllable distribution cues relative to stress cues, but instead may emerge due to an increased reliance on and integration of a broad array of segmentation cues.

18.
《Cognitive Development》1995,10(2):201-224
Previous studies have found that children can use social-pragmatic cues to determine “which one” of several objects or “which one” of several actions an adult intends to indicate with a novel word. The current studies attempted to determine whether children can also use such cues to determine “what kind” of referent, object, or action, an adult intends to indicate. In the first study, 27-month-old children heard an adult use a nonce word in conjunction with a nameless object while it was engaged in a nameless action. The discourse situation leading into this naming event was manipulated so that in one condition the target action was the one new element in the discourse context at the time of the naming event, and in another condition the target object was the one new element. Results showed that children learned the new word for whichever element was new to the discourse context. The second study followed this same general method, but in this case children in one condition watched as an adult engaged in preparatory behaviors that indicated her desire that the child perform the action before she produced the novel word, whereas children in another condition saw no such preparation. Results showed that children who saw the action preparation learned the new word for the action, whereas children who saw no preparation learned the new word for the object. These two studies demonstrate the important role of social-pragmatic information in early word learning, and suggest that if there is a Whole Object assumption in early lexical acquisition, it is an assumption that may be very easily overridden.

19.
This article presents a new database of 2,654 German nouns rated by a sample of 3,907 subjects on three psycholinguistic attributes: concreteness, valence, and arousal. As a new means of data collection in the field of psycholinguistic research, all ratings were obtained via the Internet, using a tailored Web application. Analysis of the obtained word norms showed good agreement with two existing norm sets. A cluster analysis revealed a plausible set of four classes of nouns: abstract concepts, aversive events, pleasant activities, and physical objects. In an additional application example, we demonstrate the usefulness of the database for creating parallel word lists whose elements match as closely as possible. The complete database is available for free from ftp://ftp.uni-duesseldorf.de/pub/psycho/lahl/WWN. Moreover, the Web application used for data collection is inherently capable of collecting word norms in any language and is going to be released for public use as well.

20.
The objective of the current study was to investigate whether emotion pictorial cues increase memory specificity among non‐clinical participants. Undergraduate university students were presented with emotion word and pictorial cues on a prompted and non‐prompted version of the Autobiographical Memory Test (AMT). In comparison to pictorial cues, participants retrieved significantly fewer specific autobiographical memories in response to word cues on the prompted AMT; however, there was no significant difference on the non‐prompted AMT. Participants also retrieved significantly fewer specific memories in response to both word and pictorial cues on the non‐prompted AMT compared with the prompted AMT. These results provide support for the hypothesis that among non‐clinical participants, visual cues increase memory specificity over and above emotion. Further research is needed to investigate ways in which memory specificity can be increased and the use of imagery may be a promising avenue.
