Similar Articles
20 similar articles found (search time: 109 ms)
1.
Guttman's principal components for the weighting system are the item scoring weights that maximize the generalized Kuder-Richardson reliability coefficient. The principal component for any item is effectively the same as the factor loading of the item divided by the item standard deviation, the factor loadings being obtained from an ordinary factor analysis of the item intercorrelation matrix.
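The weighting rule described above (factor loading divided by item standard deviation) can be sketched as follows; the loadings and standard deviations are invented for illustration, not taken from the article.

```python
import numpy as np

# Hypothetical one-factor loadings from an ordinary factor analysis
# and the corresponding item standard deviations (illustrative values).
loadings = np.array([0.70, 0.55, 0.60, 0.45])
item_sd = np.array([0.45, 0.50, 0.48, 0.50])

# Guttman-style scoring weight for each item: loading divided by item SD.
weights = loadings / item_sd
print(np.round(weights, 3))
```

Items measured on a coarser scale (larger SD) thus receive proportionally smaller scoring weights for the same loading.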

2.
It is pointed out that the scoring weights for test items should be approximations to regression-equation weights. For this reason, any estimate of the reliability of the weight should not be permitted to influence the size of the weight, but should be used in determining the limit of acceptability of an item. A simple approximation weight is recommended for general use, and an abac is provided for its estimation when the correlation between item and criterion is the phi coefficient. A formula for the standard error of this weight is derived, and tables of significant and very significant weights are presented in terms of deviations from the median weight.

3.
An attribute hierarchy method for polytomously scored items with unequal attribute weights   Cited by: 1 (1 self, 0 other)
This paper develops an attribute hierarchy method (AHM) based on the graded response model (GRM) with unequal attribute weights, abbreviated as the unequal-attribute-weight GRM-AHM. Under an attribute hierarchy structure, the paper uses Bayesian networks and least squares to derive the conditional probability that an examinee masters an attribute and to compute attribute weights, identifying and addressing the problem that an attribute's weight may differ across items. The study further extends cognitive diagnosis to the polytomous-scoring case. Experiments show that the unequal-attribute-weight GRM-AHM attains a high correct classification rate.

4.
A formula for internal consistency reliability is developed within the framework of the analysis of variance. The test items are assumed to be homogeneous, but may have any weights. The data needed for computation are the students' test scores and the total number of items answered so as to have the same weight. It is shown that this formula reduces to the Kuder-Richardson (21) formula for item weights of one and zero. Some empirical validation is offered. Author's affiliation: Social Security Board (on military leave).
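The Kuder-Richardson formula 21 special case mentioned above needs only the total scores, the number of items, and the score variance; a minimal sketch with invented scores follows.

```python
import numpy as np

def kr21(scores, k):
    """Kuder-Richardson formula 21 from total scores on k
    equally weighted 0/1 items."""
    m = scores.mean()
    v = scores.var(ddof=1)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * v))

# Hypothetical total scores for 8 examinees on a 10-item test.
totals = np.array([7, 5, 9, 6, 8, 4, 7, 6], dtype=float)
print(round(kr21(totals, k=10), 3))
```

KR-21 assumes all items are equally difficult, which is what makes it computable from totals alone.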

5.
A hybrid procedure for number correct scoring is proposed. The proposed scoring procedure is based on both classical true-score theory (CTT) and multidimensional item response theory (MIRT). Specifically, the hybrid scoring procedure uses test item weights based on MIRT and the total test scores are computed based on CTT. Thus, what makes the hybrid scoring method attractive is that this method accounts for the dimensionality of the test items while test scores remain easy to compute. Further, the hybrid scoring does not require large sample sizes once the item parameters are known. Monte Carlo techniques were used to compare and contrast the proposed hybrid scoring method with three other scoring procedures. Results indicated that all scoring methods in this study generated estimated and true scores that were highly correlated. However, the hybrid scoring procedure had significantly smaller error variances between the estimated and true scores relative to the other procedures.
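The hybrid idea above (MIRT-derived item weights, CTT-style totals) reduces to a weighted number-correct score; the responses and weights below are invented, with the weights standing in for values one might derive from fitted MIRT discrimination parameters.

```python
import numpy as np

# Hypothetical 0/1 item responses (rows = examinees, columns = items).
responses = np.array([[1, 0, 1, 1],
                      [0, 1, 1, 0],
                      [1, 1, 1, 1]])

# Illustrative MIRT-based item weights (e.g., derived from the
# discrimination parameters of a fitted MIRT model).
mirt_weights = np.array([0.8, 1.2, 1.0, 0.6])

# CTT-style total: a simple weighted number-correct score per examinee.
hybrid_scores = responses @ mirt_weights
print(hybrid_scores)
```

Once the weights are fixed, scoring each new examinee is a single dot product, which is why no large calibration sample is needed at scoring time.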

6.
The linear logistic test model (LLTM) specifies the item parameters as a weighted sum of basic parameters. The LLTM is a special case of a more general nonlinear logistic test model (NLTM) where the weights are partially unknown. This paper concerns the identifiability of the NLTM. Sufficient and necessary conditions for global identifiability are presented for an NLTM where the weights are linear functions, while conditions for local identifiability are shown to require a less restricted model. It is also discussed how these conditions can be checked using an algorithm due to Bekker, Merckens, and Wansbeek (1994). Several illustrations are given. This article was written while the first author was a postdoctoral fellow at the University of Twente. He gratefully acknowledges the university's hospitality and the financial support by NWO (project no. 30002).

7.
In covariance structure modelling, the non‐centrality parameter of the asymptotic chi‐squared distribution is typically used as an indicator of asymptotic power for hypothesis tests. When a latent linear regression is of interest, the contribution to power by the maximal reliability coefficient, which is associated with used latent variable indicators, is examined and this relationship is further explicated in the case of congeneric measures. It is also shown that item parcelling may reduce power of tests of latent regression parameters. Recommendations on weights for parcelling to avoid power loss are provided, which are found to be those of optimal linear composites with maximal reliability.
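For congeneric measures, the maximal-reliability composite weights referred to above are proportional to each indicator's loading divided by its error (unique) variance; a minimal sketch with invented loadings and error variances, taking the latent variance as 1:

```python
import numpy as np

# Hypothetical congeneric loadings and error (unique) variances.
lam = np.array([0.8, 0.7, 0.6])
theta = np.array([0.36, 0.51, 0.64])

# Maximal-reliability weights: proportional to loading / error variance.
w = lam / theta
w = w / w.sum()          # normalized for interpretability

# Reliability of the weighted composite (latent variance fixed at 1):
# true-score variance over total variance.
true_var = (w @ lam) ** 2
rel = true_var / (true_var + w ** 2 @ theta)
print(np.round(w, 3), round(rel, 3))
```

An equally weighted composite of the same indicators has strictly lower reliability whenever the loading-to-error ratios differ across items.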

8.
Tests or personality inventories with differential item-response weights may be scored by means of punch-card equipment. Detailed instructions are given for preparing the cards and scoring the forms. The scoring speed is approximately four to eight times that attained by manual scoring. The author is indebted to H. M. Cox, University of Nebraska, for certain suggestions on the columnar coding and the method of checking herein presented.

9.
A method is proposed for constructing indices as linear functions of variables such that the reliability of the compound score is maximized. Reliability is defined in the framework of latent variable modeling [i.e., item response theory (IRT)], and optimal weights of the components of the index are found by maximizing the posterior variance relative to the total latent variable variance. Three methods for estimating the weights are proposed. The first is a likelihood-based approach, that is, marginal maximum likelihood (MML). The other two are Bayesian approaches based on Markov chain Monte Carlo (MCMC) computational methods. One is based on an augmented Gibbs sampler specifically targeted at IRT, and the other is based on a general-purpose Gibbs sampler such as implemented in OpenBUGS and JAGS. Simulation studies are presented to demonstrate the procedure and to compare the three methods. Results are very similar, so practitioners may prefer the easily accessible general-purpose sampler. A real data set pertaining to the 28-joint Disease Activity Score is used to show how the methods can be applied in a complex measurement situation with multiple time points and mixed data formats.

10.
Alternative weights and invariant parameters in optimal scaling   Cited by: 1 (0 self, 1 other)
Under conditions that are commonly satisfied in optimal scaling problems, arbitrary sets of optimal weights can be obtained by choices of generalized inverse procedures. A simple relationship holds between these and the corresponding invariant item scores. The case of optimal scaling originally treated by Guttman [1941] yields a restricted form of multicategory factor analysis. It is suggested that the invariant parameters of optimal scaling should be interpreted, according to the principles of latent trait theory, rather than the arbitrary weights. This paper benefits from a number of suggestions and comments made by Professors M. J. R. Healy, H. Goldstein, and S. Nishisato, and by Mr. C. Fraser, to whom grateful acknowledgments are due. The author is solely responsible for the final form of the paper, including of course such errors as may remain in it. This research was partly supported by Grant No. A6346 from the Natural Sciences and Engineering Research Council of Canada.

11.
Millon Clinical Multiaxial Inventory II (MCMI-II; Millon, 1987) results from 134 patients were scored twice, with and without the item weights. The results showed that the correlations between the weighted and unweighted versions of the same scales were extremely high, exceeding .90 in all cases. Furthermore, weighting did not significantly reduce the correlations among the scales, either within each of the four syndrome/pattern categories of the MCMI-II or between categories. It is concluded that item weighting reduces the accessibility of the MCMI-II to clinicians without improving its psychometric properties.

12.
Forgetting curves: implications for connectionist models   Cited by: 4 (0 self, 4 other)
Forgetting in long-term memory, as measured in a recall or a recognition test, is faster for items encoded more recently than for items encoded earlier. Data on forgetting curves are fit well by a power function. In contrast, many connectionist models predict either exponential decay or completely flat forgetting curves. This paper suggests a connectionist model that accounts for power-function forgetting curves by using bounded weights and by generating the learning rates from a monotonically decreasing function. The bounded weights introduce exponential forgetting in each weight, and power-function forgetting results when weights with different learning rates are averaged. It is argued that these assumptions are biologically reasonable; therefore, power-function forgetting curves are a property that may be expected from biological networks. The model has an analytic solution, which is a good approximation of a power function displaced one lag in time. This function fits better than any of the 105 suggested two-parameter forgetting-curve functions when tested on the most precise recognition memory data set collected by. Unlike the power function normally used, the suggested function is defined at lag zero. Several functions for generating learning rates with a finite integral yield power-function forgetting curves; however, the type of function influences the rate of forgetting. It is shown that power-function forgetting curves cannot be accounted for by variability in performance between subjects, because this would require a distribution of performance that is not found in empirical data. An extension of the model accounts for the intersecting forgetting curves found with massed and spaced repetitions. The model can also be extended to account for a faster forgetting rate in item recognition (IR) compared to associative recognition at short but not long retention intervals.
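The core mechanism above, exponential decay in each weight averaging to a power function across a spread of learning rates, can be demonstrated numerically; the decay rates and lags below are arbitrary illustrative choices, not the article's parameters.

```python
import numpy as np

# Each weight forgets exponentially at its own decay rate; averaging
# over a spread of rates produces a power-function-like curve.
rates = np.linspace(0.01, 2.0, 500)              # spread of decay rates
lags = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])

# Mean retention across weights at each lag.
retention = np.array([np.mean(np.exp(-rates * t)) for t in lags])

# On log-log axes the averaged curve is close to a straight line
# (a power function), unlike any single exponential.
slope = np.polyfit(np.log(lags), np.log(retention), 1)[0]
print(np.round(retention, 4), round(slope, 2))
```

The fitted log-log slope comes out near -1, i.e., roughly 1/t forgetting, whereas each individual component decays exponentially.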

13.
The parent, teacher, and clinician forms of the IJR Behavior Checklist yield four summary scores (checklist total score, pathology weighted total score, mean pathology score, and highest 5 items' mean pathology score). The checklist total score is essentially a symptom count with a double-weighting for frequent/intense occurrence; the other three scores incorporate item pathology weights. All four summary scores were shown to have moderately high validity as measures of the construct child/adolescent psychopathology: They differentiated between well-adjusted and clinical subsamples at the .001 level. The two scores based entirely on the pathology weights manifested less satisfactory reliability than the two scores reflecting primarily number of symptoms, but surpassed the latter in power to discriminate between psychotic and nonpsychotic patients.

14.
李中权, 王力, 张厚粲, 周仁来. 《心理学报》 (Acta Psychologica Sinica), 2011, 43(9): 1087-1094
Understanding the sources of variation in item difficulty is the first step toward automated item generation. Based on a literature review, four classes of factors affecting the difficulty of figural reasoning items were identified. By manipulating the familiarity of compositional elements, the abstractness of attributes, the harmony of perceptual organization, and the type and number of rules, eight figural reasoning tests were constructed, comprising 112 items similar to Raven's Advanced Progressive Matrices. Using an anchor-test equating design, 10 Advanced Progressive Matrices items were embedded in each test as anchors, and the tests were administered online to 6,323 participants. Item parameters were estimated with BILOG-MG, and IRTEQ was used to equate the tests, placing the parameters of all items in the last seven tests on the scale of the first test. Regressing item difficulty on item-stem feature variables showed that all four factors significantly predicted difficulty. Dominance analysis indicated that memory load (the combination of rule type and number) was the most important predictor of item difficulty, followed by abstractness of attributes, harmony of perceptual organization, and familiarity of compositional elements.

15.
While the Angoff (1971) method is a commonly used cut-score method, critics (Berk, 1996; Impara & Plake, 1997) argue that the Angoff places too-high cognitive demands on raters. In response to these criticisms, a number of modifications to the method have been proposed. Suggested Angoff modifications include using an iterative rating process, presenting judges with normative data about item performance, revising the rating judgment into a Yes/No decision, assigning relative weights to dimensions within a test, and using item response theory in setting cut scores. In this study, subject-matter-expert raters were provided with a 'difficulty-anchored' rating scale to use while making Angoff ratings; this scale can be viewed as a variation of the Angoff normative-data modification. The rating scale presented test items having known p-values as anchors, and served as a simple means of providing normative information to guide the Angoff rating process. Results are discussed regarding the reliability of the mean Angoff rating (.73) and the correlation of mean Angoff ratings with item difficulty (observed r ranges from .65 to .73).
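In the standard Angoff procedure underlying the study above, the cut score is the sum over items of the mean judged probability that a minimally competent examinee answers correctly; a minimal sketch with an invented judge-by-item rating matrix:

```python
import numpy as np

# Hypothetical Angoff ratings: each judge estimates, per item, the
# probability that a minimally competent examinee answers correctly.
# Rows = judges, columns = items (invented values).
ratings = np.array([[0.6, 0.8, 0.5, 0.7],
                    [0.5, 0.9, 0.4, 0.6],
                    [0.7, 0.8, 0.6, 0.7]])

item_means = ratings.mean(axis=0)   # mean Angoff rating per item
cut_score = item_means.sum()        # expected number-correct cut score
print(np.round(item_means, 3), round(cut_score, 2))
```

Comparing each column of `item_means` against the item's empirical p-value is one way to compute the rating-difficulty correlations the abstract reports.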

16.
This paper elaborates a recent conceptualization of feature-based attention in terms of attention filters (Drew et al., Journal of Vision, 10(10:20), 1–16, 2010) into a general purpose centroid-estimation paradigm for studying feature-based attention. An attention filter is a brain process, initiated by a participant in the context of a task requiring feature-based attention, which operates broadly across space to modulate the relative effectiveness with which different features in the retinal input influence performance. This paper describes an empirical method for quantitatively measuring attention filters. The method uses a “statistical summary representation” (SSR) task in which the participant strives to mouse-click the centroid of a briefly flashed cloud composed of items of different types (e.g., dots of different luminances or sizes), weighting some types of items more strongly than others. In different attention conditions, the target weights for different item types in the centroid task are varied. The actual weights exerted on the participant’s responses by different item types in any given attention condition are derived by simple linear regression. Because, on each trial, the centroid paradigm obtains information about the relative effectiveness of all the features in the display, both target and distractor features, and because the participant’s response is a continuous variable in each of two dimensions (versus a simple binary choice as in most previous paradigms), it is remarkably powerful. The number of trials required to estimate an attention filter is an order of magnitude fewer than the number required to investigate much simpler concepts in typical psychophysical attention paradigms.
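The regression step described above can be sketched in one dimension: each trial's click is modeled as a weighted combination of the per-type centroids, and the attention-filter weights are recovered by least squares. All quantities below (true weights, noise level, stimulus positions) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_types = 200, 3
true_w = np.array([0.6, 0.3, 0.1])   # attention filter to be recovered

# Per-trial centroid (x-coordinate) of each item type (simulated stimuli).
type_centroids = rng.uniform(-1, 1, size=(n_trials, n_types))

# Observed click position: weighted combination of type centroids + motor noise.
clicks = type_centroids @ true_w + rng.normal(0, 0.05, n_trials)

# Recover the attention-filter weights by simple linear regression.
w_hat, *_ = np.linalg.lstsq(type_centroids, clicks, rcond=None)
print(np.round(w_hat, 2))
```

Because every trial constrains all three weights at once, a few hundred trials pin down the filter quite precisely, which is the efficiency advantage the abstract emphasizes.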

17.
Associative memory involves three components: item 1, item 2, and the association between them. Recognition of item 1 or item 2 is called item recognition, whereas recognition of the item 1-item 2 association is called associative recognition. Dual-process theory holds that item recognition can be accomplished by both familiarity and recollection, whereas associative recognition can only be accomplished by recollection. However, many recent studies have found that when to-be-learned item pairs are unitized into a single integrated representation, familiarity can also support associative recognition. Fewer studies have examined how unitization affects item recognition in associative memory; the existing literature suggests two views. One is the 'benefits-only' view, which holds that unitization improves associative recognition without affecting item recognition; the other is the 'costs and benefits' view, which holds that unitization improves associative recognition at the expense of item recognition. Future research should examine the effect of unitization on item recognition in associative memory and its neural mechanisms. Understanding the specific effects of unitization on associative and item recognition would help in choosing encoding strategies suited to particular memory tasks so as to improve memory performance.

18.
汪文义, 丁树良. 《心理科学》 (Psychological Science), 2012, 35(2): 452-456
Previous research has shown that the reachability matrix plays an important role in constructing cognitive diagnostic tests, but this has not yet received wide attention. This paper examines how the absence from an item bank of the item classes corresponding to the reachability matrix affects the accuracy of online calibration of the attribute vectors of raw (new) items. A simulation using an independent structure with six attributes showed that when the item bank is not sufficient and necessary, the accuracy of attribute calibration for raw items is degraded, and items in the bank outside the reachability matrix partially compensate. This indirectly confirms that the reachability matrix plays a very important role in cognitive diagnostic item banks.

19.
It is common practice in IRT to consider items as fixed and persons as random. Both continuous and categorical person parameters are most often random variables, whereas for items only continuous parameters are used, and they are commonly of the fixed type, although exceptions occur. It is shown in the present article that random item parameters make sense theoretically, and that in practice the random item approach is promising for handling several issues, such as the measurement of persons, the explanation of item difficulties, and troubleshooting with respect to DIF. In correspondence with these issues, the article comprises three parts. All three rely on the Rasch model as the simplest model to study, and the same data set is used for all applications. First, it is shown that the Rasch model with fixed persons and random items is an interesting measurement model, both in theory and for its goodness of fit. Second, the linear logistic test model with an error term is introduced, so that the explanation of the item difficulties based on the item properties does not need to be perfect. Finally, two more models are presented: the random item profile model (RIP) and the random item mixture model (RIM). In the RIP, DIF is not considered a discrete phenomenon, and when a robust regression approach based on the RIP difficulties is applied, quite good DIF identification results are obtained. In the RIM, no prior anchor sets are defined; instead a latent DIF class of items is used, so that posterior anchoring is realized (anchoring based on the item mixture). It is shown that both approaches are promising for the identification of DIF.

20.
A promising approach to understanding the processes involved when subjects respond to personality items is provided by the investigation of the causes of inconsistent responses when subjects answer the same item on two occasions. Among these causes are the properties of the item. Previous item research focused almost exclusively on properties which are not highly specific to the item, such as endorsement rate (ER) and social desirability scale value (SDSV). Although past studies found that items with ‘extreme’ SDSVs and/or ERs elicit fewer inconsistencies, these studies ignored more item-specific properties such as item content and item ambiguity. The present study demonstrates that contrary results regarding consistency may be obtained when more item-specific properties are taken into consideration. These results are interpreted as evidence that certain kinds of item content can increase the indecision and conflict that characterize some subjects' response processes.
