Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
2.
The crux in psychometrics is how to estimate the probability that a respondent answers an item correctly on one occasion out of many. Under the current testing paradigm this probability is estimated using all kinds of statistical techniques and mathematical modeling. Multiple evaluation is a new testing paradigm using the person's own personal estimates of these probabilities as data. It is compared to multiple choice, which appears to be a degenerate form of multiple evaluation. Multiple evaluation has much less measurement error than multiple choice, and this measurement error is not in favor of the examinee. When the test is used for selection purposes as it is with multiple choice, the probability of a Type II error (unjustified passes) is almost negligible. Procedures for statistical item-and-test analyses under the multiple evaluation paradigm are presented. These procedures provide more accurate information in comparison to what is possible under the multiple choice paradigm. A computer program that implements multiple evaluation is also discussed.

3.
The idea that naturally sampled frequencies facilitate performance in statistical reasoning tasks because they are a cognitively privileged representational format has been challenged by findings that similarly structured numbers presented as chances similarly facilitate performance, on the basis of the claim that these are technically single-event probabilities. A crucial opinion, however, is that of the research participants, who possibly interpret chances as de facto frequencies. A series of experiments here indicate that not only is performance improved by clearly presented natural frequencies, rather than chances phrasing, but also that participants who interpreted chances as frequencies, rather than as probabilities, were consistently better at statistical reasoning. This result was found across different variations of information presentation and across different populations.

4.
This single case study was designed to gather evidence regarding whether the mental representations mediating multiplication fact retrieval make use of single or multiple codes. MC is a brain-damaged volunteer whose numerical processing impairments were limited to multiplication fact retrieval. He relearned three sets of multiplication facts. Each set was relearned in one of three input formats: Arabic, written verbal, or spoken verbal. Following training all facts were tested in all input formats. MC's posttraining performance was virtually error free and showed no effects of input format. However, reaction-time data showed fact retrieval was fastest when the training format matched the test format. Results are discussed in relation to single- and multiple-code models of multiplication fact retrieval.

5.
The authors investigated whether performance on mathematical test items would be influenced by an interaction between presentation format and gender. One hundred fourteen students in a management accounting course were randomly assigned either to a tabular format or to a graphics format. There were significant main effects for gender and presentation format; men outperformed women, and the subjects who received the tabular format outperformed the subjects who received the graphics format. A significant interaction supported the existence of a conditional relationship between performance on mathematical test items and presentation format. This relationship varied as a function of gender (symmetry permits the interchange of presentation format and gender). Simple effects for the interaction determined that the women who received the graphics presentation did not perform as well as their male counterparts, or as well as other women and men who received the tabular format. The results of this study indicate that presentation format is an important consideration in gender differences for mathematics performance.

6.
Among current state-of-the-art estimation methods for multilevel IRT models, the two-stage divide-and-conquer strategy has practical advantages, such as clearer definition of factors, convenience for secondary data analysis, convenience for model calibration and fit evaluation, and avoidance of improper solutions. However, various studies have shown that, under the two-stage framework, ignoring measurement error in the dependent variable in stage II leads to incorrect statistical inferences. To this end, we proposed a novel method to correct both measurement bias and measurement error of latent trait estimates from stage I in the stage II estimation. In this paper, the HO-IRT model is considered as the measurement model, and a linear mixed effects model on overall (i.e., higher-order) abilities is considered as the structural model. The performance of the proposed correction method is illustrated and compared via a simulation study and a real data example using the National Educational Longitudinal Survey data (NELS 88). Results indicate that structural parameters can be recovered better after correcting measurement biases and errors.

7.
This paper details the results of an empirical investigation of the random errors associated with decomposition estimates of multiattribute utility. In a riskless setting, two groups of subjects were asked to evaluate multiattribute alternatives both holistically and with the use of an additive decomposition. For one group, the alternatives were described in terms of three attributes, and for the other in terms of five. Estimates of random error associated with the various elicitations (holistic, single-attribute utility, scaling constants, or weights) were obtained using a test-retest format. It was found for both groups that the additive decomposition had significantly smaller levels of random error than the holistic evaluation. However, the number of attributes did not seem to make a significant difference to the amount of random error associated with the decomposition estimates. The levels of error found in the various elicitations were consistent with theoretical bounds that have recently been proposed in the literature. These results show that the structure imposed on the problem through decomposition results in measurable improvement in quality of the multiattribute utility judgements, and contribute to a greater understanding of the decomposition method in decision analysis.

8.
The Levels of Emotional Awareness Scale (LEAS) is a performance‐based measure of emotional awareness. This study examined whether the LEAS is suitable to be administered orally by administering two half‐forms of the LEAS to literate participants; one orally and one in written format. In doing so, this study raised questions regarding the internal reliability and statistical equivalence of the LEAS half‐forms. Despite this, results showed no significant difference between oral and written administration. Further, the correlation between scores obtained through oral and written administration was no less than the correlation between the LEAS‐A and LEAS‐B half‐forms. Together, these results suggest that, in circumstances where administering the written format of the LEAS is not possible, this scale may be administered orally.

9.
Occupational role performance is a key concept of occupational therapy theory and practice. The Role Change Assessment was developed to facilitate the systematic evaluation of the role performance of older adults. It uses a semi-structured interview format to examine 48 roles in family and social, vocational, self-care, organizational, leisure, and health care categories. This article describes the need for and development of the instrument and illustrates its use through case reports.

10.
This article examines how between-individual comparisons influence performance evaluations in rating tasks. The authors demonstrated a systematic change in the perceived difference across ratees as a result of changing the way performance information is expressed. Study 1 found that perceived performance difference between 2 individuals was greater when their objective performance levels were presented with small numbers (e.g., absence rates of 2% vs. 5%) than when they were presented with large numbers (e.g., attendance rates of 98% vs. 95%). Extending this finding to situations involving trade-offs between multiple performance attributes across ratees, Study 2 showed that the relative preference for 1 ratee over another actually reversed when the presentation format of the performance information changed. The authors draw upon prospect theory to offer a theoretical framework describing the between-individual comparison aspect of performance evaluation.

11.
This paper examines how the presentation of computer-monitored performance information affects performance judgments. Two factors were examined: the performance pattern and the information format. In a computer simulation, subjects were responsible for evaluating the performance of a computer-monitored typist. They were assigned to one of three format conditions: a periodic, delayed, or summarized format. The pattern of the typist's performance was also varied: It either improved, worsened, or remained about the same during the simulation. Results indicate that performance pattern affected subjects' ratings of overall performance, performance quality, and performance consistency. Both factors influenced ratings of future performance and recall of specific performance information. Implications of these results for performance appraisals and computerized performance monitoring systems are discussed.

12.
Gary L. Brase 《心理学报》 (Acta Psychologica Sinica) 2007,39(3):398-405
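What processes take place when format manipulations facilitate Bayesian reasoning? One view holds that naturally sampled frequencies engage a privileged representational system that is relatively specific in its operation. An opposing view holds that naturally sampled frequencies are merely one way of triggering a more general process involving nested-set relations. Comparing the two, the latter predicts that fairly brief and direct interventions (such as simple instructions) should suffice to improve reasoning, whereas the former implies that improvement requires more extensive intervention and/or more insightful understanding. The present study shows that neither brief, immediate interventions, nor pre-existing representational biases, nor representational flexibility improved participants' performance. On the other hand, there was evidence that frequentist interpretations of the problems improved statistical reasoning performance and sometimes also increased confidence in responses. These results support the privileged representational system view.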

13.
Manolov R  Arnau J  Solanas A  Bono R 《Psicothema》2010,22(4):1026-1032
The present study evaluates the performance of four methods for estimating regression coefficients used to make statistical decisions about intervention effectiveness in single-case designs. Ordinary least squares estimation is compared to two correction techniques dealing with general trend and a procedure that eliminates autocorrelation whenever it is present. Type I error rates and statistical power are studied for experimental conditions defined by the presence or absence of treatment effect (change in level or in slope), general trend, and serial dependence. The results show that empirical Type I error rates do not approach the nominal ones in the presence of autocorrelation or general trend when ordinary and generalized least squares are applied. The techniques controlling trend show lower false alarm rates, but prove to be insufficiently sensitive to existing treatment effects. Consequently, the use of the statistical significance of the regression coefficients for detecting treatment effects is not recommended for short data series.

14.
In simulation studies, the F test for differences in regression slopes has tended to distort nominal Type I and II error rates when the 2 subgroup error variances exceeded a 1.50:1 ratio. This study examines the frequency and extent that this ratio is violated within data sets relevant to applied psychology. The General Aptitude Test Battery (GATB) validity study database contained ability data and overall job performance ratings. The Project A military database contained both ability and personality data, along with job performance factor scores and an overall job performance rating. Results suggest that subgroup (White-Black, male-female) error variances are often homogeneous enough to support F test results from past empirical work. Enough heterogeneity was found, however, to urge applied psychologists investigating differential prediction to explore their data and consider the possibility of alternative statistical tests.

15.

Since the early days of COVID-19, university teaching has changed from face-to-face format to online mode. With the gradual containment of the pandemic, there is no need for school lockdown. As a result, the teaching format has changed to HyFlex mode integrating both face-to-face and online modes. Obviously, it is necessary to understand the academic quality of life among students under the HyFlex teaching mode. In this paper, we report an evaluation study on a leadership subject in Hong Kong delivered via HyFlex teaching using a post-lecture evaluation strategy. In one of the lectures, we covered law-abiding leadership in university students, including abiding by the Hong Kong National Security Law. The post-lecture evaluation showed that students generally held positive views toward the HyFlex teaching and they perceived that the subject promoted their well-being indexed by psychosocial competence. Regarding the lecture on law-abiding leadership, students agreed that the lecture promoted their psychosocial competence, personal development, knowledge about law-abiding behavior and national security (including the Hong Kong National Security Law), and readiness to serve as socially responsible leaders. Positive perceptions of the lecture design, teacher performance, lecture content of law-abiding leadership and national security, and benefits positively predicted students’ overall satisfaction with the lecture on law-abiding leadership and national security.


16.
Theories of skilled music performance must account for variations on what is written in traditional musical notation. Some variations are intentional and reflect structural features of the music that are chosen for emphasis by the performer. Current music notations are inadequate to reflect these variations. Computer applications are described that allow graphical and statistical examination of performance variations on traditional musical notation. An integrated set of visual and sound tools is provided that allows music to be recorded, edited, analyzed, and played back on electronic and acoustic musical instruments. The graphical format allows flexibility through the use of windows, compression, expansion, and scrolling of multiple sources of information, mapping of acoustic to visual dimensions, and scaling of different performance parameters without normalization. Experimental evidence from piano performances is used to demonstrate how graphical formats can aid research on human performance.

17.
Newell, Mitchell, and Hayes (NMH) conduct three experiments designed to test whether exemplar cuing (EC) theory or a statistical format theory provides a more accurate account for how people make judgments about low‐probability events. They report finding support for the statistical format theory and little or no support for EC. However, NMH misstate the requirements for the production of exemplars in EC theory. As a result, they confuse non‐exemplar conditions with exemplar conditions in their experiments, and find results that are virtually irrelevant to EC theory. Copyright © 2009 John Wiley & Sons, Ltd.

18.
Although many nonlinear models of cognition have been proposed in the past 50 years, there has been little consideration of corresponding statistical techniques for their analysis. In analyses with nonlinear models, unmodeled variability from the selection of items or participants may lead to asymptotically biased estimation. This asymptotic bias, in turn, renders inference problematic. We show, for example, that a signal detection analysis of recognition memory data leads to asymptotic underestimation of sensitivity. To eliminate asymptotic bias, we advocate hierarchical models in which participant variability, item variability, and measurement error are modeled simultaneously. By accounting for multiple sources of variability, hierarchical models yield consistent and accurate estimates of participant and item effects in recognition memory. This article is written in tutorial format; we provide an introduction to Bayesian statistics, hierarchical modeling, and Markov chain Monte Carlo computational techniques.

19.
20.
PROCTOR, an on-line interactive system for student evaluation and monitoring in courses employing the PSI format, is described. This system involves a tester module that assumes many of the routine managerial duties traditionally assigned to the student proctor, such as quiz administration and scoring, performance feedback, and complete record keeping. Also, PROCTOR includes a multipurpose editor that allows an instructor to enter, examine, or modify any information in the disk files accessed by the tester. Factors related to the implementation and use of this system are considered, and its advantages for management, quality control, and staffing in PSI courses are discussed.
