Similar articles
20 similar articles retrieved (search time: 15 ms)
1.
In 4 experiments, the authors examined how several variables influence the quality and quantity of information that people use to make judgments about other people. The results showed that when possible, participants consistently responded appropriately to variables that influenced the information that they used to make inferences about other minds. The results also suggested that under circumstances with no opportunity to contrast behavior in different situations, people might not be sensitive to the quality and quantity of information present. The authors interpreted the results to mean that under most circumstances, people make inferences in a way that efficiently uses information about the causes of behavior.

2.
This study presents a psychometric evaluation of the Expanded Cognitive Reflection Test (CRT7) based on item response theory. The participants (N = 1204) completed the CRT7 and provided self-reported information about their cognitive styles through the Preference for Intuition and Deliberation Scale (PID). A two-parameter logistic model was fitted to the data to obtain the item difficulty and discrimination parameters of the CRT7. The results showed that the items had good discriminatory power (αs = .80–2.92), but the range of difficulty was restricted (βs ranged from −.60 to .32). Moreover, the CRT7 showed a pattern of correlations with the PID which was similar to that of the original CRT. Taken together, these results are evidence of the adequacy of the CRT7 as an expanded tool for measuring cognitive reflection; however, one of the newer items (the pig item) was consistently problematic across analyses, and so it is recommended that future studies remove it from the CRT7.
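The two-parameter logistic model fitted above can be sketched in a few lines. This is a minimal illustration of the 2PL response function only; the item parameters below are hypothetical, not the CRT7 estimates.

```python
import math

def p_correct(theta, a, b):
    """Two-parameter logistic (2PL) model: probability that an examinee
    with ability theta answers correctly an item with discrimination a
    and difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item with high discrimination (a = 2.5) and difficulty
# slightly below average ability (b = -0.3), mimicking the restricted
# difficulty range reported above.
print(round(p_correct(0.0, 2.5, -0.3), 3))  # → 0.679
```

A highly discriminating item like this separates examinees sharply around its difficulty: small changes in theta near b = −0.3 move the probability quickly between near 0 and near 1.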

3.
Information functions are used to find the optimum ability levels and maximum contributions to information for estimating item parameters in three commonly used logistic item response models. For the three- and two-parameter logistic models, examinees who contribute maximally to the estimation of item difficulty contribute little to the estimation of item discrimination. This suggests that in applications that depend heavily upon the veracity of individual item parameter estimates (e.g., adaptive testing or test construction), better item calibration results may be obtained (for fixed sample sizes) from examinee calibration samples in which ability is widely dispersed. This work was supported by Contract No. N00014-83-C-0457, project designation NR 150-520, from the Cognitive Science Program, Cognitive and Neural Sciences Division, Office of Naval Research, and by Educational Testing Service through the Program Research Planning Council. Reproduction in whole or in part is permitted for any purpose of the United States Government. The author wishes to acknowledge the invaluable assistance of Maxine B. Kingston in carrying out this study, and to thank Charles Lewis for his many insightful comments on earlier drafts of this paper.
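As a rough illustration of the information-function idea, here is the familiar 2PL information function for ability (a simpler quantity than the information about item parameters that the paper analyzes); the parameters are illustrative.

```python
import math

def p2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def ability_information(theta, a, b):
    # Fisher information a single 2PL item carries about ability:
    # I(theta) = a^2 * P * (1 - P); it peaks at theta = b, where P = 0.5.
    p = p2pl(theta, a, b)
    return a * a * p * (1.0 - p)

# Information is maximal for examinees whose ability matches the item's
# difficulty (peak value a^2 / 4) and falls off on either side.
print(ability_information(0.0, 1.5, 0.0))  # → 0.5625
print(ability_information(2.0, 1.5, 0.0))
```

The paper's point is the asymmetry in the analogous functions for the item parameters themselves: examinees near theta = b are the most informative about b but not about a, which is why widely dispersed calibration samples help.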

5.
This paper demonstrates that choice processing may differ in missing-information situations from full-information situations, depending on whether inferences are used to fill in missing values and on the overlap of the missing information itself. It is shown that when individuals do not form inferences to fill in missing values, fewer fully attribute-based processes are used, and more processes that accommodate missing attribute values, either alternative-based or given-dimension attribute-based, are used. It is also shown that when a processing shift due to missing information does occur, the overlap of the missing values will affect the type of shift that takes place. If overlap is high, a shift to given-dimension attribute-based processing is more likely; when overlap is low, a shift to alternative-based processing is more likely. When individuals do form inferences to fill in missing values, processing is more similar to that in full-information situations. Finally, it is shown that individuals will often only partially fill in missing information, thus moderating the proposed effects.

6.
Cognitive diagnosis is the core of the new generation of measurement theory and is of great significance for formative instructional assessment. Calibrating the cognitive attributes of items is a basic and important task in cognitive diagnosis, yet there has been very little research on methods that assist attribute calibration, and existing methods have many practical limitations. Classroom assessment is an ideal setting for applying cognitive diagnosis, but items in classroom assessment are often chosen ad hoc, and it is difficult for teachers to identify item cognitive attributes accurately in a short time. This study is the first to propose using rough set theory to calibrate item cognitive attributes. The method requires neither a large number of examinees and items nor known item parameters, and it can produce diagnostic results on the spot, making it suitable for paper-and-pencil classroom assessment. Monte Carlo simulations showed that the rough set method can calibrate item cognitive attributes quickly and with high accuracy; moreover, calibration accuracy increases as the number of attributes per item decreases, as the accuracy of examinee knowledge-state estimation increases, or as the slip rate decreases. Introducing rough set methods plays an important role in extending the range of applications of cognitive diagnosis and in genuinely realizing its function as an aid to instruction.

7.
A plausible s-factor solution for many types of psychological and educational tests is one that exhibits a general factor and s − 1 group or method related factors. The bi-factor solution results from the constraint that each item has a nonzero loading on the primary dimension and on at most one of the s − 1 group factors. This paper derives a bi-factor item-response model for binary response data. In marginal maximum likelihood estimation of item parameters, the bi-factor restriction leads to a major simplification of the likelihood equations and (a) permits analysis of models with large numbers of group factors; (b) permits conditional dependence within identified subsets of items; and (c) provides more parsimonious factor solutions than an unrestricted full-information item factor analysis in some cases. Supported by the Cognitive Science Program, Office of Naval Research, under Grant #N00014-89-J-1104. We would like to thank Darrell Bock for several helpful suggestions.
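The bi-factor loading restriction for a binary item can be sketched as follows. This follows the general form of such a model with illustrative, hypothetical parameters; it is not the paper's estimation procedure.

```python
import math

def bifactor_p(theta_general, theta_group, a_general, a_group, intercept):
    """Bi-factor item response probability: every item loads on the
    general factor and on at most one group factor; its loadings on all
    other group factors are constrained to zero."""
    z = a_general * theta_general + a_group * theta_group + intercept
    return 1.0 / (1.0 + math.exp(-z))

# A hypothetical item assigned to group factor 2 in a 3-group solution:
# its loading vector over (general, group1, group2, group3) would be
# (1.2, 0.0, 0.8, 0.0), so only two latent variables enter its response.
print(round(bifactor_p(0.5, -0.2, 1.2, 0.8, 0.0), 3))  # → 0.608
```

Because each item involves at most two latent dimensions regardless of how many group factors the model has, the marginal likelihood reduces to iterated two-dimensional integrals, which is the computational simplification the abstract refers to.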

8.
Source identification refers to memory for the origin of information. A consistent nomenclature is introduced for empirical measures of source identification which are then mathematically analyzed and evaluated. The ability of the measures to assess source identification independently of identification of an item as old or new depends on assumptions made about how inconsistencies between the item and source components of a source-monitoring task may be resolved. In most circumstances, the empirical measure that is used most often when source identification is measured by collapsing across pairs of sources (sometimes called “the identification-of-origin score”) confounds item identification with source identification. Alternative empirical measures are identified that do not confound item and source identification in specified circumstances. None of the empirical measures examined provides a valid measure of source identification in all circumstances.

9.
In this paper I argue that defeasible (nonmonotonic) inferences are occasion-sensitive: the inferential connections of a given claim (plus collateral premises) depend on features of the circumstances surrounding the occasion of inference (including typical environmental conditions, and also pragmatic and contextual factors, such as the information available to agents and how high stakes are). More specifically, it is an occasion-sensitive matter which possible defeaters have to be considered explicitly by the premises of an inference and which possible defeaters may remain unconsidered, without making the inference enthymematic. As a result, a largely unexplored form of occasion-sensitivity arises in inferentialist theories of content that appeal to defeasible inferences.

10.
Randomization-based inference about latent variables from complex samples
Standard procedures for drawing inferences from complex samples do not apply when the variable of interest θ cannot be observed directly, but must be inferred from the values of secondary random variables that depend on θ stochastically. Examples are proficiency variables in item response models and class memberships in latent class models. Rubin's multiple imputation techniques yield approximations of sample statistics that would have been obtained had θ been observable, and associated variance estimates that account for uncertainty due to both the sampling of respondents and the latent nature of θ. The approach is illustrated with data from the National Assessment of Educational Progress. This research was supported by Grant No. NIE-G-83-0011 of the Office for Educational Research and Improvement, Center for Education Statistics, and Contract No. N00014-88-K-0304, R&T 4421552, from the Cognitive Sciences Program, Cognitive and Neural Sciences Division, Office of Naval Research. It does not necessarily reflect the views of either agency. I am grateful to R. Darrell Bock for calling my attention to the applicability of multiple imputation to the assessment setting; to Albert Beaton and Eugene Johnson for enlightening discussions on the topic; and to Henry Braun, Ben King, Debra Kline, Gary Phillips, Paul Rosenbaum, Don Rubin, John Tukey, Ming-Mei Wang, Kentaro Yamamoto, Rebecca Zwick, and two anonymous reviewers for comments on earlier drafts. Example 4 is based on the analysis of the 1984 National Assessment of Educational Progress reading survey, carried out at Educational Testing Service through the tireless efforts of too many people to mention by name, under the direction of Albert Beaton, Director of NAEP Data Analyses. David Freund, Bruce Kaplan, and Jennifer Nelson conducted additional analyses of the 1984 and 1988 data for the example.
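The multiple-imputation machinery referred to above pools analyses across several plausible-value draws using Rubin's combining rules. A minimal sketch with toy numbers (not NAEP data):

```python
import statistics

def combine_imputations(point_estimates, sampling_variances):
    """Rubin's rules: pool M completed-data analyses, one per imputed
    (plausible-value) data set, into one estimate and a total variance
    reflecting both sampling and latent-variable uncertainty."""
    m = len(point_estimates)
    q_bar = statistics.mean(point_estimates)     # pooled point estimate
    u_bar = statistics.mean(sampling_variances)  # within-imputation variance
    b = statistics.variance(point_estimates)     # between-imputation variance
    total = u_bar + (1.0 + 1.0 / m) * b          # total variance
    return q_bar, total

# Three plausible-value analyses of the same statistic (toy values).
est, var = combine_imputations([0.50, 0.55, 0.45], [0.010, 0.012, 0.011])
print(round(est, 6), round(var, 6))  # → 0.5 0.014333
```

The between-imputation term b is what captures the extra uncertainty due to the latent nature of the proficiency variable; ignoring it would understate the standard error.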

11.
A major research direction for ability measurement has been to identify the information processes that are involved in solving test items through mathematical modeling of item difficulty. However, this research has had limited impact on ability measurement, since person parameters are not included in the process models. The current paper presents some multicomponent latent trait models for reproducing test performance from both item and person parameters on processing components. Components are identified from item subtasks, in which performance is a logistic function (i.e., Rasch model) of person and item parameters, and then are combined according to a mathematical model of processing on the composite item. The author would like to thank David Thissen for his invaluable insights concerning this model and an anonymous reviewer for his suggestion about the sample space for the model. This research was partially supported by National Institute of Education grant number NIE-6-7-0156 to Susan E. Whitely, principal investigator. However, the opinions expressed herein do not necessarily reflect the position or policy of the National Institute of Education, and no official endorsement by the National Institute of Education should be inferred. Part of this paper was presented at the annual meeting of the Psychometric Society, Monterey, California, June 1979.
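One common way to combine Rasch-modeled components into a composite item is to require success on every component, so that component probabilities multiply. This sketch assumes that combination rule and hypothetical parameters; it is an illustration of the idea, not the paper's full model.

```python
import math

def rasch(theta, b):
    """Rasch probability of success on a single processing component."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def composite_p(component_thetas, component_difficulties):
    # Assumed combination rule: the composite item is solved only if
    # every component is executed successfully, so the component
    # probabilities multiply.
    p = 1.0
    for theta, b in zip(component_thetas, component_difficulties):
        p *= rasch(theta, b)
    return p

# A hypothetical two-component item for a person who is strong on the
# first component (theta = 1.0) and weaker on the second (theta = -0.5).
print(round(composite_p([1.0, -0.5], [0.0, 0.0]), 3))  # → 0.276
```

Note how the weakest component dominates: the composite probability is below both component probabilities, which is why separate person parameters per component carry diagnostic information that a single ability score cannot.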

12.
As a new measurement paradigm of the 21st century, cognitive diagnosis has attracted growing attention both in China and abroad. This paper implements parameter estimation for the reduced reparameterized unified model (R-RUM) using an MCMC algorithm and examines its performance with Monte Carlo simulation. The results show that: (1) the R-RUM parameter estimation method is feasible and its estimation accuracy is high; (2) the complexity of the Q-matrix and the level of the model parameters have a considerable effect on estimation accuracy: as the r_(jk)* values increase and the Q-matrix becomes more complex, the accuracy of both item and examinee parameter estimates gradually declines; (3) under certain conditions, the R-RUM is fairly robust.

13.
Most indexes of item validity and difficulty vary systematically with changes in the mean and variance of the group. Formulas are presented showing how certain item parameters will vary with these alterations in group mean and variance. Item parameters are also suggested which should remain invariant under such changes. These parameters are developed under two different assumptions: first, that the total distribution of the item ability variable is normal, and, second, that the distribution of the item ability variable for each array of the explicit selection variable is normal. The writer wishes to acknowledge helpful discussions of this paper with Paul Horst and Herbert S. Sichel, who have worked on various aspects of the problem of invariant item parameters.

14.
When an item response theory model fails to fit adequately, the items for which the model provides a good fit and those for which it does not must be determined. To this end, we compare the performance of several fit statistics for item pairs with known asymptotic distributions under maximum likelihood estimation of the item parameters: (a) a mean and variance adjustment to bivariate Pearson's X2, (b) a bivariate subtable analog to Reiser's (1996) overall goodness-of-fit test, (c) a z statistic for the bivariate residual cross product, and (d) Maydeu-Olivares and Joe's (2006) M2 statistic applied to bivariate subtables. The unadjusted Pearson's X2 with heuristically determined degrees of freedom is also included in the comparison. For binary and ordinal data, our simulation results suggest that the z statistic has the best Type I error and power behavior among all the statistics under investigation when the observed information matrix is used in its computation. However, if one has to use the cross-product information, the mean and variance adjusted X2 is recommended. We illustrate the use of pairwise fit statistics in 2 real-data examples and discuss possible extensions of the current research in various directions.

15.
In the framework of a linear logistic testing model, Mislevy, Sheehan, and Wingersky (1993) showed how to incorporate collateral information in estimating item parameters required for test equating. The purpose of the study was to explore the feasibility of applying this method to equate tests constructed for a college entrance examination by comparing its results with those of item response theory (IRT) true-score equating. Overall, the equating results based on collateral information are relatively comparable with those of IRT equating. In terms of R²'s, the prediction equations for item characteristics are good to excellent. The significance levels of the correlation coefficients between IRT-calibrated b (difficulty) parameters and predicted b parameters range from around .01 to .05. The goodness of fit of true-score test characteristic curves (TCCs) based on collateral information to IRT true-score TCCs is excellent. Results of the study are discussed in light of factors that may affect the validity of using collateral information in test equating.

16.
Replenishing item pools for on-line ability testing requires innovative and efficient data collection designs. By generating local D-optimal designs for selecting individual examinees, and consistently estimating item parameters in the presence of error in the design points, sequential procedures are efficient for on-line item calibration. The estimation error in the on-line ability values is accounted for with an item parameter estimate studied by Stefanski and Carroll. Locally D-optimal n-point designs are derived using the branch-and-bound algorithm of Welch. In simulations, the overall sequential designs appear to be considerably more efficient than random seeding of items. This report was prepared under the Navy Manpower, Personnel, and Training R&D Program of the Office of the Chief of Naval Research under Contract N00014-87-0696. The authors wish to acknowledge the valuable advice and consultation given by Ronald Armstrong, Charles Davis, Bradford Sympson, Zhaobo Wang, Ing-Long Wu, and three anonymous reviewers.
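A hedged sketch of the D-criterion behind such designs: the determinant of the information matrix for a 2PL item's parameters (a, b), accumulated over the abilities of the examinees assigned to the item. The values are illustrative; this shows the criterion only, not the branch-and-bound search or the errors-in-design-points correction.

```python
import math

def p2pl(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def d_criterion(design_thetas, a, b):
    """Determinant of the 2 x 2 Fisher information matrix for the item
    parameters (a, b) of a 2PL item, summed over the design points.
    The gradient of the logit a * (theta - b) w.r.t. (a, b) is
    (theta - b, -a), and each point contributes P(1-P) * g * g^T."""
    m11 = m12 = m22 = 0.0
    for theta in design_thetas:
        p = p2pl(theta, a, b)
        w = p * (1.0 - p)
        g1, g2 = theta - b, -a
        m11 += w * g1 * g1
        m12 += w * g1 * g2
        m22 += w * g2 * g2
    return m11 * m22 - m12 * m12

# Dispersed abilities are jointly informative about both a and b; a
# sample concentrated at a single ability equal to b tells us nothing
# about the discrimination, so its determinant collapses to zero.
print(d_criterion([-2.0, -1.0, 0.0, 1.0, 2.0], a=1.0, b=0.0) >
      d_criterion([0.0, 0.0, 0.0, 0.0, 0.0], a=1.0, b=0.0))  # → True
```

Sequentially assigning each new item to examinees whose current ability estimates maximize this determinant is the intuition behind the on-line calibration designs in the abstract.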

17.
This paper focuses on model interpretation issues and employs a geometric approach to compare the potential value of using the Grade of Membership (GoM) model in representing population heterogeneity. We consider population heterogeneity manifolds generated by letting subject-specific parameters vary over their natural range, while keeping other population parameters fixed, in the marginal space (based on marginal probabilities) and in the full parameter space (based on cell probabilities). The case of a 2 × 2 contingency table is discussed in detail, and a generalization to 2^J tables with J ≥ 3 is sketched. Our approach highlights the main distinction between the GoM model and the probabilistic mixture of classes by demonstrating geometrically the difference between the concepts of partial and probabilistic membership. By using the geometric approach we show that, in special cases, the GoM model can be thought of as being similar to an item response theory (IRT) model in representing population heterogeneity. Finally, we show that the GoM item parameters can provide quantities analogous to more general logistic IRT item parameters. As a latent structure model, the GoM model might be considered a useful alternative for a data analysis when both classes of extreme responses, and additional heterogeneity that cannot be captured by those latent classes, are expected in the population. This work was supported by Award #1R03 AG18986-01 from the National Institute on Aging and NIH grant #1R01 CA94212-01. The presentation of the ideas in this paper owes much to discussions with Stephen Fienberg and Brian Junker, Carnegie Mellon University. The author thanks Jim Ramsay and two anonymous reviewers for their valuable comments on earlier drafts of this paper.

18.
Human Performance, 2013, 26(4): 331-347
Preemployment drug and alcohol tests can provide valid information about current impairment and/or recent use of specific substances, but their relevance for making decisions about job applicants may be limited. We show how the testing method, the narrow window of time in which specific drugs or their metabolites can be detected, and the circumstances of testing combine to substantially affect the inferences that can be drawn about applicants who pass or fail these tests. We suggest a state-trait framework for analyzing preemployment drug and alcohol tests, and note that the very weak and indirect links between the states being measured (e.g., recent drug use) and the trait inferences being drawn from the test (e.g., long-term suitability for employment) may limit the validity of these tests.

19.
20.
Best-worst scaling is a judgment format in which participants are presented with a set of items and have to choose the superior and inferior items in the set. Best-worst scaling generates a large quantity of information per judgment because each judgment allows for inferences about the rank value of all unjudged items. This property of best-worst scaling makes it a promising judgment format for research in psychology and natural language processing concerned with estimating the semantic properties of tens of thousands of words. A variety of different scoring algorithms have been devised in the previous literature on best-worst scaling. However, due to problems of computational efficiency, these scoring algorithms cannot be applied efficiently to cases in which thousands of items need to be scored. New algorithms are presented here for converting responses from best-worst scaling into item scores for thousands of items (many-item scoring problems). These scoring algorithms are validated through simulation and empirical experiments, and considerations related to noise, the underlying distribution of true values, and trial design are identified that can affect the relative quality of the derived item scores. The newly introduced scoring algorithms consistently outperformed scoring algorithms used in the previous literature on scoring many-item best-worst data.
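The simplest scoring algorithm from the earlier best-worst literature, which the new many-item algorithms improve upon, is best-minus-worst counting. A minimal sketch with a hypothetical two-trial data set:

```python
def best_minus_worst(trials):
    """Score items from best-worst trials: +1 each time an item is
    chosen as best, -1 each time it is chosen as worst. Each trial is a
    tuple (items_shown, best_item, worst_item)."""
    scores = {}
    for items, best, worst in trials:
        assert best in items and worst in items and best != worst
        for item in items:
            scores.setdefault(item, 0)  # shown-but-unchosen items score 0
        scores[best] += 1
        scores[worst] -= 1
    return scores

# Two hypothetical 4-item trials rating words for pleasantness.
trials = [
    (("calm", "happy", "angry", "sad"), "happy", "angry"),
    (("calm", "angry", "bored", "sad"), "calm", "sad"),
]
print(best_minus_worst(trials))
```

Counting scales poorly in quality when each of thousands of items appears in only a few trials, since most information about the unjudged middle items is discarded; that sparsity is the many-item scoring problem the new algorithms target.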
