Automatically generating questions and evaluating their answers is a highly challenging task in natural language processing and educational technology. This work focuses on generating subjective questions and proposes an evaluation system for assessing the answers. To generate the questionnaires, key-phrases are extracted from the course curriculum (syllabus). Next, based on the key-phrases, different types of subjective questions are generated. Finally, students' responses are evaluated using a multi-criteria decision-making approach. It uses a set of model answers taken from different textbooks and subject experts to evaluate the answers. Multiple measures are used to assess the answers by comparing them with this model set. The results of the proposed system reveal that the automated appraisal process can reduce the manual effort required of human evaluators.
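
The evaluation step described above can be illustrated with a minimal sketch. The abstract does not name the similarity measures or the decision-making method, so everything here is an assumption made for illustration: the three measures (`jaccard`, `cosine`, `length_ratio`), the weights in `CRITERIA`, and the use of a simple weighted sum over the best-matching model answer.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def jaccard(a, b):
    """Share of distinct words common to both answers."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def cosine(a, b):
    """Cosine similarity of raw term-frequency vectors."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def length_ratio(a, b):
    """Shorter length over longer length, so 1.0 means equal length."""
    return min(len(a), len(b)) / max(len(a), len(b)) if a and b else 0.0


# Hypothetical criteria and weights (summing to 1); not taken from the paper.
CRITERIA = {jaccard: 0.4, cosine: 0.4, length_ratio: 0.2}


def score_answer(student, model_answers):
    """Weighted-sum score against the best-matching model answer, in [0, 1]."""
    s = tokens(student)
    best = 0.0
    for model in model_answers:
        m = tokens(model)
        total = sum(weight * measure(s, m) for measure, weight in CRITERIA.items())
        best = max(best, total)
    return best
```

Taking the maximum over the model set is one possible design choice: it rewards a response that closely matches any single textbook or expert answer, rather than averaging over answers that may be phrased very differently.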

We report two experiments that investigated the regulation of memory accuracy with a new regulatory mechanism: the plurality option. This mechanism is closely related to the grain-size option but involves control over the number of alternatives contained in an answer rather than the quantitative boundaries of a single answer. Participants were presented with a slideshow depicting a robbery (Experiment 1) or a murder (Experiment 2), and their memory was tested with five-alternative multiple-choice questions. For each question, participants were asked to generate two answers: a single answer consisting of one alternative and a plural answer consisting of the single answer and two other alternatives. Each answer was rated for confidence (Experiment 1) or for the likelihood of being correct (Experiment 2), and one of the answers was selected for reporting. Results showed that participants used the plurality option to regulate accuracy, selecting single answers when their accuracy and confidence were high, but opting for plural answers when they were low. Although accuracy was higher for selected plural than for selected single answers, the opposite pattern was evident for confidence or likelihood ratings. This dissociation between confidence and accuracy for selected answers was the result of marked overconfidence in single answers coupled with underconfidence in plural answers. We hypothesize that these results can be attributed to overly dichotomous metacognitive beliefs about personal knowledge states that cause subjective confidence to be extreme.

Computers were used to evaluate the effects of supplying answers to programmed instruction frames. A group experimental design compared passive reading, covert responding to frame blanks, and actively typing answers to blanks with and without immediate confirmation of correctness. Effects of a 315-frame program, teaching elements of programmed instruction design, were evaluated by analyzing answers to posttest generalization questions and an application test. Results strongly supported the effectiveness of requiring the student to supply fragments of a terminal repertoire while working through a program. Students who could either covertly respond to frame blanks or who were required to type frame answers performed significantly better on the frame generalization posttest and, more importantly, carefully followed program rules when preparing elements of a new instructional program.

Two experiments were conducted to investigate the nature of the intuitive problem representation used in evaluating mathematical strategies. The first experiment tested between two representations: a representation composed of principles and an integrated representation. Subjects judged the correctness of unseen math strategies based only on the answers they produced for a set of temperature mixture problems. The distance of the given answers from the correct answers and whether the answers violated one of the principles of temperature mixture were manipulated. The results supported the principle representation hypothesis. In the second experiment we manipulated subjects’ understanding of an acid mixture task with a brief paragraph of instruction on one of the principles. Subjects then completed an estimation task intended to measure their understanding of the problem domain. The evaluation task from the first experiment was then presented, but with acid mixture instead of temperature mixture. The results showed that intuitive understanding of the domain mediates the effect of instruction on evaluating problems. Additionally, the results supported the hypothesis that subjects perform a mapping process between their intuitive understanding and math strategies.

Previous research has shown that certain interviewer behaviors can evoke inaccurate answers by children. In the current study, we examined the effects of approving and disapproving statements on the accuracy of 3 children's answers to questions in an interview (Experiment 1). We then evaluated 3 questioning techniques that may be used by interviewers during a forensic interview in which a child provides eyewitness testimony (Experiment 2). All participants responded with more inaccurate answers when approving statements followed inaccurate information and disapproving statements followed accurate information in Experiment 1. During Experiment 2, 1 participant responded most inaccurately when she was requestioned after providing an initial answer, whereas the remaining 2 participants responded most inaccurately when the interviewer provided cowitness information and suggestive questions.

A new approach to problem solving was applied to multisolution problems in a memory search task. Subjects memorized a list of eight four-letter foods, and then searched mentally through the list for answers to questions. The times between successive answers (IRTs) were recorded along with the answers themselves. This allowed a comparison of two possible memory search strategies: (1) sampling with replacement, and (2) sampling without replacement. The results were largely in agreement with the sampling-without-replacement strategy. However, a more detailed breakdown of the data revealed that most subjects searched through the list in a rigid serial order. Further, an analysis of questions with identical answers showed that the IRTs were very nearly additive. This led to an additive time component model based on the independent summation of (a) read-in time, (b) memory-search time, (c) decision-making time, and (d) response-output time. This approach appeared generally more satisfactory than previous attempts to account for problem-solving behavior.

By watching and responding to the way a shill answered “yes-no” questions about food items, a developmentally delayed preschool boy greatly improved not only on his poor baseline “yes-no” answers to these same items, but also on “yes-no” answers during generalization probe sessions to untrained objects that included picture cards and body parts. Correct “yes-no” answers during follow-up sessions were also high for both the trained and untrained objects. The suggested mechanism for improvement was the elaborated form of the answers the shill issued, which served to highlight and prompt “yes” versus “no” answers. Generalized “yes-no” answers improved as the training components (removing the shill and prior labeling questions) approximated the form of the generalized probe sessions. Experiment II confirmed that as soon as the labels for new objects were taught, “yes-no” answers to all labeled objects were immediately perfect and remained so thereafter. To nonlabeled objects, “yes-no” answers were initially at chance level, but these answers for certain nonlabeled objects subsequently improved and their names were learned without explicit training, apparently through prior correct “yes-no” responding.

Our actions and decisions are regularly influenced by the social environment around us. Can social cues be leveraged to induce curiosity and affect subsequent behavior? Across two experiments, we show that curiosity is contagious: The social environment can influence people's curiosity about the answers to scientific questions. Participants were presented with everyday questions about science from a popular on-line forum, and these were shown with a high or low number of up-votes as a social cue to popularity. Participants indicated their curiosity about the answers, and they were given an opportunity to reveal a subset of those answers. Participants reported greater curiosity about the answers to questions when the questions were presented with a high (vs. low) number of up-votes, and they were also more likely to choose to reveal the answers to questions with a high (vs. low) number of up-votes. These effects were partially mediated by surprise and by the inferred usefulness of knowledge, with a more dramatic effect of low up-votes in reducing curiosity than of high up-votes in boosting curiosity. Taken together, these results highlight the important role social information plays in shaping our curiosity.

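Revisiting the ERP effects of “insight” in a riddle-guessing task   Cited by: 9 (self-citations: 2, citations by others: 7)
Event-related potentials (ERPs) were used to examine the time course of brain activity after the answer was provided during the solving of insight problems (Chinese character riddles). The results showed that, within 250–400 ms, both the “insight” and the “incomprehension” conditions elicited a more negative ERP deflection than the “no insight” condition. In the “insight minus no insight” and “incomprehension minus no insight” difference waves, the latency of this negative component was about 320 ms (N320), and topographic maps showed that N320 activity was strongest over central-posterior sites. Dipole source analysis of the “insight minus no insight” difference wave further indicated that the N320 originated mainly near the anterior cingulate cortex (ACC). This seems to suggest that the N320 may reflect the cognitive conflict between new and old lines of thought at the moment the answer is provided, but that it does not truly reveal the distinctive time course of brain activity corresponding to the successful breaking of mental set and the “aha” experience in insight problem solving.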


In this article, I describe and systematize the different answers to the question ‘What is ubuntu?’ that I have been able to identify among South Africans of African descent (SAADs). I show that it is possible to distinguish between two clusters of answers. The answers of the first cluster all define ubuntu as a moral quality of a person, while the answers of the second cluster all define ubuntu as a phenomenon (for instance a philosophy, an ethic, African humanism, or a worldview) according to which persons are interconnected. The concept of a person is of central importance to all the answers of both clusters, which means that to understand these answers, it is decisive to raise the question of who counts as a person according to SAADs. I show that some SAADs define all Homo sapiens as persons, whereas others hold the view that only some Homo sapiens count as persons: only those who are black, only those who have been incorporated into personhood, or only those who behave in a morally acceptable manner.

Research with adults indicates that confidence in the correctness of an answer decreases as a function of the amount of time it takes to reach that answer, suggesting that people use response latency as a mnemonic cue for subjective confidence. Experiment 1 extended investigation to 2nd, 3rd and 5th graders. When children chose the answer to general knowledge questions, their confidence in the answer was inversely related to choice latency. However, the strength of the relationship increased with grade, suggesting increased reliance with age on the feedback from task performance. The validity of latency as a cue for the accuracy of the answer also increased with age, possibly contributing to the observed age increase in the extent to which confidence judgment discriminated between correct and wrong answers. Whereas these results illustrate the dependence of metacognitive monitoring on the feedback from control operations, Experiments 2 and 3 examined the idea that control‐based monitoring affects subsequent control operations. When children were free to choose which answers to volunteer under a payoff schedule that emphasized accuracy, they tended to volunteer high‐confidence answers more than low‐confidence answers (Experiment 2) and more short‐latency answers than long‐latency answers (Experiment 3). The latter tendency was again stronger for older than for younger children. The results are discussed in terms of the intricate relationships between monitoring and control processes.

The merits of unconscious thought in creativity   Cited by: 1 (self-citations: 0, citations by others: 1)
Research has yielded weak empirical support for the idea that creative solutions may be discovered through unconscious thought, despite anecdotes to this effect. To understand this gap, we examined the effect of unconscious thought on two outcomes of a remote-association test (RAT): implicit accessibility and conscious reporting of answers. In Experiment 1, which used very difficult RAT items, a short period of unconscious thought (i.e., participants were distracted while holding the goal of solving the RAT items) increased the accessibility of RAT answers, but did not increase the number of correct answers compared with an equal duration of conscious thought or mere distraction. In Experiment 2, which used moderately difficult RAT items, unconscious thought led to a similar level of accessibility, but fewer correct answers, compared with conscious thought. These findings confirm and extend unconscious-thought theory by demonstrating that processes that increase the mental activation of correct solutions do not necessarily lead them into consciousness.

The goal of intelligent tutoring systems (ITS) that interact in natural language is to emulate the benefits that a well-trained human tutor provides to students, by interpreting student answers and appropriately responding in order to encourage elaboration. BRCA Gist is an ITS developed using AutoTutor Lite, a Web-based version of AutoTutor. Fuzzy-trace theory theoretically motivated the development of BRCA Gist, which engages people in tutorial dialogues to teach them about genetic breast cancer risk. We describe an empirical method to create tutorial dialogues and fine-tune the calibration of BRCA Gist’s semantic processing engine without a team of computer scientists. We created five interactive dialogues centered on pedagogic questions such as “What should someone do if she receives a positive result for genetic risk of breast cancer?” This method involved an iterative refinement process of repeated testing with different texts and successively making adjustments to the tutor’s expectations and settings in order to improve performance. The goal of this method was to enable BRCA Gist to interpret and respond to answers in a manner that best facilitated learning. We developed a method to analyze the efficacy of the tutor’s dialogues. We found that BRCA Gist’s assessment of participants’ answers was highly correlated with the quality of the answers found by trained human judges using a reliable rubric. The dialogue quality between users and BRCA Gist predicted performance on a breast cancer risk knowledge test completed after exposure to the tutor. The appropriateness of BRCA Gist’s feedback also predicted the quality of answers and breast cancer risk knowledge test scores.

Local and global judgments of confidence   Cited by: 3 (self-citations: 0, citations by others: 3)
Studies of calibration have shown that people's mean confidence in their answers (local confidence) tends to be greater than their overall estimate of the percentage of correct answers (global confidence). Moreover, whereas the former exhibits overconfidence, the latter often exhibits underconfidence. Three studies present evidence that global underconfidence reflects a failure to make an allowance for correct answers that are likely to result from mere guessing and can be eliminated by informing participants of the dubious normative status of estimates below 50% (i.e., chance). Previously reported discrepancies between global and local confidence, it is concluded, arise less from possible methodological artifacts in assessment of local confidence than from normatively inappropriate assessments of global confidence.
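
The distinction between local and global confidence can be made concrete with a small sketch. It assumes the common convention that bias is confidence minus the proportion of correct answers (positive means overconfidence, negative means underconfidence). The function name `calibration_summary` and the sample numbers are hypothetical and are not data from the studies.

```python
def calibration_summary(item_confidences, item_correct, global_estimate):
    """Compare local (per-item) and global confidence against accuracy.

    item_confidences: probabilities in [0, 1], one per answered item
    item_correct:     1 for a correct answer, 0 for a wrong one
    global_estimate:  the participant's overall estimate of the
                      proportion answered correctly, in [0, 1]
    """
    n = len(item_correct)
    accuracy = sum(item_correct) / n
    local_confidence = sum(item_confidences) / n
    return {
        "accuracy": accuracy,
        # Positive bias = overconfidence, negative bias = underconfidence.
        "local_bias": local_confidence - accuracy,
        "global_bias": global_estimate - accuracy,
    }


# Hypothetical participant: confident item by item, cautious overall.
summary = calibration_summary(
    item_confidences=[0.9, 0.8, 0.7, 0.6],
    item_correct=[1, 1, 0, 0],
    global_estimate=0.4,
)
```

In this made-up case the local bias is positive (about +0.25) while the global bias is negative (about -0.10), which is the pattern the abstract describes. A global estimate below 0.5 on two-alternative items is also below what guessing alone would yield.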

Three experiments tested the prediction that incubation effects are caused by interactions between activation and environmental clues. Participants worked on 20 experimental problems and then were informed that they would have a second chance to work on the problems. Half were told they might see clues before returning to the problems and were instructed to try to use such clues. Participants then had an incubation period during which they generated words from the letters of test words. The test words were either semantically related to experimental problem answers, the actual answers, or unrelated words. Finally, all participants again tried to solve the experimental problems. Resolution, calculated as the number of items solved during the second trial that were not solved initially, was measured. Participants who saw answers during incubation resolved more items than those who saw related words. In Experiment 3, participants receiving no instructions did not differ across clue conditions, whereas instructed participants who saw answers resolved more problems than those who received related words. Participants in the instructed and unrelated condition performed significantly worse than those in the instructed and answer condition. Incubation effects occurred only when participants who were shown answers were also given instructions. No support was found for the theory that incubation effects are caused solely by environmental clues and activation.

The current study aims to investigate the relationship between right, wrong and missing answers to cognitive test items (test-taking patterns) in the context of the Flynn Effect (FE). We compare two cohorts of Estonian students (1933/36, n = 890; 2006, n = 913) using an Estonian adaptation of the National Intelligence Tests and document three simultaneous trends: fewer missing answers (−1 Cohen's d averaged over subtests), and a rise in the number of right and wrong answers to the subtests (average ds of .86 and .30, respectively). In the Arithmetical Reasoning and Vocabulary subtests, adjustments for false-positive answers (the number of right minus the number of wrong answers) reduced the size of the Flynn Effect by half. These subtests were supposed to be high g-loading subtests. Our conclusion is that rapid guessing has risen over time and influenced test scores more strongly over the years. The FE is partly explained by changes in test-taking behavior over time.
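
The two quantities used in this abstract, the right-minus-wrong adjustment and Cohen's d between cohorts, can be sketched as follows. The abstract does not say which variant of d was computed, so the pooled-standard-deviation form below is an assumption, and the function names are hypothetical.

```python
import math
from statistics import mean, variance


def adjusted_score(right, wrong):
    """Score corrected for false positives: right answers minus wrong answers."""
    return right - wrong


def cohens_d(older, newer):
    """Standardized mean difference (newer minus older) using the pooled SD."""
    n1, n2 = len(older), len(newer)
    pooled_var = ((n1 - 1) * variance(older) + (n2 - 1) * variance(newer)) / (n1 + n2 - 2)
    return (mean(newer) - mean(older)) / math.sqrt(pooled_var)
```

Computing `cohens_d` once on raw right-answer counts and once on `adjusted_score` values for the same two cohorts would show how much of a cohort gain survives the correction for wrong answers.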

This study investigated the effects of repetition, memory, feedback, and hindsight bias on the realism in confidence in answers to questions on a filmed kidnapping. In Experiment 1 the participants showed overconfidence in all conditions. In the Repeat condition (‘how confident are you now that your previous answers are correct’) overconfidence was reduced as a consequence of the decrease in confidence in both correct and incorrect answers. Compared with the Repeat condition, when the participants received feedback on their answers and were asked to remember their initial confidence, the confidence level was higher for correct and lower for incorrect answers. In Experiment 2, recalled confidence (the Memory condition) increased compared with the original confidence both for correct and incorrect answers; the effect of this was increased overconfidence. The Hindsight condition showed a decrease in confidence in incorrect answers. The results suggest that a unique hindsight effect may be more clearly present for incorrect than for correct answers. Our study gives further evidence for the malleability of the realism in eyewitness confidence and we discuss both the theoretical and forensic implications of our findings. Copyright © 2000 John Wiley & Sons, Ltd.

To examine the role of accuracy motivation in event recall, 6-, 7-, and 8-year-old children and adults were shown a short video about a conflict between two groups of children. Three weeks later, participants were asked a set of unbiased specific questions about the video. Following A. Koriat and M. Goldsmith's (1994) distinction of quantity- and quality-oriented memory assessments, and based on their model of strategic regulation of memory accuracy (1996), accuracy motivation was manipulated across three conditions. Participants were (a) forced to provide an answer to each question (low accuracy motivation), (b) initially instructed to withhold uncertain answers by saying "I don't know" (medium accuracy motivation), or (c) rewarded for every single correct answer (high accuracy motivation). When motivation for accuracy was high, children as young as 6 were able to withhold uncertain answers to the benefit of accuracy. The expected quality-quantity trade-off emerged only for peripheral items but not for the central items. Participants who were forced to provide an answer gave more correct answers but also higher numbers of incorrect answers than participants who had the option to answer "I don't know." The results are discussed in terms of the underlying model as well as in terms of forensic interviewing.

Kristie Miller. Erkenntnis, 2010, 73(2): 211-235
There is a good deal of disagreement about composition. There is first-order disagreement: there are radically different answers to the special composition question—the question of under what circumstances the xs compose a y. There is second-order disagreement: there are different answers to the question of whether first-order disagreement is real or merely semantic. Virtually all disputants with respect to both the first- and second-order issues agree that the answer or answers to the special composition question will take the form of a necessary truth or truths even though, as I will argue, such answers do not appear to be good candidates to be necessary truths. This paper provides an analysis of the concept of <exists> as it pertains to concrete objects, that fulfils two functions. First, it explicates the sense in which claims about composition are contingent and the sense in which they are necessary, and second, it provides a way of understanding when first-order disputes are substantial and when they are merely semantic.
