Similar literature
1.
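A generalizability theory analysis of the reliability of criterion-referenced tests and their cut scores   Total citations: 2, self-citations: 1, citations by others: 1
In measurement practice, classical test theory methods are very often misused to estimate the overall reliability of criterion-referenced tests and the decision reliability of their cut scores. For example, whether the measurement design is crossed or nested, and whether the results are given a norm-referenced or a criterion-referenced interpretation, test developers often report only Cronbach's α or a few other classical test theory reliability indices, and mistaking overall reliability for cut-score reliability is even more common. This is quite inappropriate. Using the dependability indices Φ and Φ(λ) from generalizability theory, this paper examines, for crossed and nested designs respectively, how to estimate the overall reliability of a criterion-referenced test and the decision reliability of its cut scores. Worked data examples compare crossed and nested designs in estimating the overall reliability of a criterion-referenced test and illustrate how to estimate the decision reliability of cut scores.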
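For orientation, the two indices named in this abstract are commonly written as below for a single-facet design (persons p, items i). This is a sketch in standard textbook notation, which is an assumption here; the paper's own symbols and exact formulas are not given in the abstract. Here σ²_p is the person (universe-score) variance, n_i the number of items, μ the grand mean, and λ the cut score.

```latex
% Absolute error variance, crossed design p x i:
\sigma^2(\Delta) = \frac{\sigma^2_i + \sigma^2_{pi,e}}{n_i}
% Absolute error variance, nested design i:p
% (item and residual effects are confounded in \sigma^2_{i:p}):
\sigma^2(\Delta) = \frac{\sigma^2_{i:p}}{n_i}

% Dependability index for the test as a whole:
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2(\Delta)}

% Dependability index for decisions at cut score \lambda:
\Phi(\lambda) = \frac{\sigma^2_p + (\mu - \lambda)^2}
                     {\sigma^2_p + (\mu - \lambda)^2 + \sigma^2(\Delta)}
```

Under these definitions Φ(λ) equals Φ when the cut score sits at the grand mean and rises as λ moves away from μ, which is why a single overall index cannot stand in for cut-score reliability.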

2.
Reliability estimation formulas for criterion-referenced tests   Total citations: 4, self-citations: 0, citations by others: 4
陈希镇 《心理学报》1996,29(4):436-442
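A criterion-referenced test differs from a norm-referenced test: an examinee's score is compared not with other examinees' scores but with a preset standard. If the test is built by random sampling from a content domain, users want to know how close an examinee's observed score on the test is to his or her score on that domain (were it known). If users want to classify examinees as masters or non-masters on the basis of test scores, they care how consistent that inference is with the one that would be made if the examinees' domain scores were known. This paper examines reliability estimation for both problems and derives several useful estimation formulas.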

3.
叶宝娟  温忠粦 《心理科学》2012,35(5):1213-1217
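A large body of research shows that composite reliability generally gives a good estimate of test reliability. Methods for estimating composite reliability and its confidence interval have been studied fairly extensively for unidimensional tests, but interval estimation for the composite reliability of multidimensional tests has rarely been discussed. This paper uses the Delta method to derive a standard error formula for the composite reliability of a multidimensional test, computes a confidence interval from it, and uses an example to show how to program the estimation of the composite reliability and its confidence interval.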

4.
关丹丹  张厚粲 《心理科学》2004,27(2):445-448
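This paper first clarifies the concept of reliability, pointing out that reliability is an index of how dependable test results are, not an invariant property of the test instrument. Because reliability estimates for test results vary, it introduces reliability generalization, proposed by Vacha-Haase at the end of the last century: a meta-analytic method for exploring the variability of score reliability estimates and the sources that predict that variability. Finally, through an analysis of the reliability generalization approach, it argues that rethinking the concept of reliability and conducting reliability generalization studies will offer new insights to psychological testing practitioners.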

5.
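An analysis of a Putonghua test using multivariate generalizability theory   Total citations: 5, self-citations: 0, citations by others: 5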
杨志明  张雷 《心理学报》2002,34(1):51-56
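Multivariate generalizability theory (MGT) was used to study the Putonghua test developed by the State Language Commission (国家语委). In the G study, data from Putonghua testing of Hong Kong examinees were used to estimate the variance and covariance components of the various sources of score variation. In the D study, the universe scores and generalizability coefficients of the test's three parts were estimated first, followed by the composite universe score and its generalizability coefficient, signal-to-noise ratio, and related indices. The results show that the test's reliability is high overall and that combining the universe scores of the three parts into a composite is reasonable, although in detail the reliability of Part 3 is lower. In addition, with 3 raters and 28 items, the reliability of Parts 1 and 2 is already fairly high, so reducing the number of items in these two parts in operational testing would not cause much of a problem.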

6.
Coefficient α and test homogeneity   Total citations: 1, self-citations: 0, citations by others: 1
刘红云 《心理科学》2008,31(1):185-188,176
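Starting from the relationships between coefficient α and homogeneous tests, parallel tests, and essentially τ-equivalent tests, and from the relationships among these three types of test, this paper analyzes the limitations of α as an estimate of test homogeneity reliability. Based on the definition of reliability given by Jöreskog (coefficient α), it discusses the relationships among the λ coefficient, α internal-consistency reliability, and Guttman's lower bound, and shows the advantages of the λ coefficient over α in estimating internal consistency when the test is homogeneous. Using simulated data, it also examines how a test's dimensional structure relates to coefficient α, Guttman's λ2 lower bound, and the λ coefficient under different scenarios.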

7.
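As testing has developed, criterion-referenced tests (CRTs) have attracted growing attention, yet they have fallen into the trap of using norm-referenced methods to interpret and report scores. Starting from the score systems of major criterion-referenced tests in China and abroad, such as CET-4 and CET-6, HSK, GRE, and CLEP, this paper analyzes what those score systems have in common, explores a score system suited to criterion-referenced tests, and finally points out problems that remain in the score systems of some current tests.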

8.
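The reliability of the measurement instrument in a longitudinal study is an important indicator of the study's quality. Traditional reliability estimation methods are not suitable for estimating test reliability in longitudinal research. In recent years, researchers have proposed four methods for estimating test reliability in longitudinal studies: the coefficients rw and r(Sw), which estimate reliability at a single time point, and the coefficients RT and RL, which estimate reliability for the longitudinal study as a whole. This paper reviews the mathematical models, assumptions, strengths, and weaknesses of these four methods. RT and RL can estimate test reliability both at a single time point and for the study as a whole, and they require fewer assumptions. Using RT and RL together is recommended for estimating test reliability in longitudinal research.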

9.
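Drawing on the relationships between coefficient α and homogeneous tests, parallel tests, and τ-equivalent tests, this paper analyzes the limitations of α as an estimate of test homogeneity reliability. It also discusses how α relates to test dimensionality and to the relationships among items, and explains some common errors in interpreting α.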

10.
A preliminary study of the Kraepelin continuous addition method   Total citations: 1, self-citations: 0, citations by others: 1
余凌  孔克勤 《心理科学》2004,27(2):350-352
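Following the basic principles of the Kraepelin continuous addition method, and drawing on the basic format and procedures of the Uchida-Kraepelin psychological test with some modifications, this study developed a performance test and established a quantitative scoring index, relative work output. A preliminary study with nearly 2,500 student participants showed that the test has good reliability and validity, and that relative work output can serve as a criterion for judging, fairly objectively, shape features of the work curve such as dips and peaks. This is of considerable significance for promoting the use of continuous addition tasks in China and for work in education, research, psychological counseling, and vocational guidance.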

11.
Assessing reliability of situational judgment tests (SJTs) in high‐stakes situations is problematic with reliability inappropriately measured by Cronbach's alpha when test items are heterogeneous. We computed the corrected, weighted mean alpha from 56 alpha coefficients, which produced a value of α = .46 and reviewed appropriate types of reliability to use with SJTs. In the current longitudinal study, SJT test–retest reliability was r = .82, compared with internal consistency, α = .46, and stratified alpha, α = .45 at Time 1 and α = .52 and stratified α = .51 at Time 2. We used a student sample (Time 1: n = 185; Time 2: n = 132) with items from a credentialing exam with ‘should do’ instructions. The SJT correlated significantly with cognitive ability, r = .30, and agreeableness, r = .24. In Study 2, we assessed test–retest reliability with Human Resource professionals (Time 1: n = 94; Time 2: n = 32) who had been recently credentialed and who participated in a pilot test of new SJT items with ‘most likely/least likely do’ response options. The SJT test–retest reliability was r = .66 compared with internal consistency, α = .43 and stratified α = .47 at Time 1 and α = .61 and stratified α = .67 at Time 2. We discuss the theoretical implications of the Study 1 results as well as the practical implications for use of SJTs in credentialing examinations.
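To make the two internal-consistency indices in this abstract concrete, the following is a minimal sketch of Cronbach's alpha and stratified alpha using the usual textbook formulas. The function names, the toy data, and the assignment of items to strata are all hypothetical and are not taken from the study; the toy data are constructed to show the mechanics and do not reproduce the reported values.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; `scores` holds one row of item scores per examinee."""
    k = len(scores[0])                      # number of items (needs k >= 2)
    item_columns = zip(*scores)             # transpose: one column per item
    item_var_sum = sum(variance(col) for col in item_columns)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def stratified_alpha(scores, strata):
    """Stratified alpha; `strata` lists the item indices of each stratum."""
    total_var = variance([sum(row) for row in scores])
    unreliable = 0.0
    for item_indices in strata:
        # Subtest made of this stratum's items only (at least 2 items each).
        sub = [[row[i] for i in item_indices] for row in scores]
        sub_var = variance([sum(row) for row in sub])
        unreliable += sub_var * (1 - cronbach_alpha(sub))
    return 1 - unreliable / total_var


# Hypothetical toy data: items 0-1 and items 2-3 form two unrelated strata.
toy_scores = [
    [0, 0, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 1, 1],
]
toy_strata = [[0, 1], [2, 3]]

alpha = cronbach_alpha(toy_scores)
alpha_strat = stratified_alpha(toy_scores, toy_strata)
print(f"alpha = {alpha:.3f}, stratified alpha = {alpha_strat:.3f}")
```

On this deliberately heterogeneous toy data, alpha comes out at about 0.67 while stratified alpha is 1.0, because each stratum is internally consistent but the two strata are unrelated. Real SJT data need not show such a gap, as the near-identical values in the abstract illustrate.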

12.
The quality of behavior analysis is of interest to many individuals within the community. Other professionals are including behavior analysis in their credentials and excluding from practice those qualified behavior analysts who do not have their credentials. Existing credentialing programs do not seem to regulate behavior analysis adequately. This article examines reasons for a professional credential in behavior analysis, various components of credentialing programs, the forms of programs available, and alternative professional credentials for behavior analysts.

13.
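Multivariate generalizability theory was used to examine the psychometric properties of a college student Internet addiction scale when applied to college students. Using a generalizability design with a random measurement model, 1,200 enrolled college students were surveyed by questionnaire. The results show that the correlations in the two-factor structure were above 0.92 and those in the five-factor structure all fell between 0.76 and 0.97. The generalizability coefficient and dependability index of the full scale both exceeded 0.94, while those of the individual factors were around 0.90 in the two-factor structure and between 0.74 and 0.85 in the five-factor structure. The full scale and its factors therefore show fairly high reliability and validity among college students and can be used as both a norm-referenced and a criterion-referenced test. Under either the two-factor or the five-factor structure, the factors of the CIAS-R are very reasonably and thoroughly designed in terms of score proportions and number of items.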

14.
The number of training programs in forensic psychology has grown considerably in the past 15 years. Numerous opportunities exist for individuals interested in pursuing careers in the forensic area. Interdisciplinary training in several forms is discussed: J.D.-Ph.D. programs, specialist Ph.D. programs, predoctoral internships, and postdoctoral fellowships. Continuing professional education, credentialing, and board certification in forensic psychology are also addressed. Despite the face validity of these various types of training and credentialing, little is known about their relative utilities.

15.
Seventy‐one leaders in state, regional, and national professional and credentialing associations in counseling responded to a survey concerning professional advocacy efforts, resources, obstacles, and needs. The results indicate a variety of ongoing advocacy initiatives, specific needs for resources and interprofessional collaboration, and agreement on the importance of advocacy for the future of the profession.

16.
The certification of counselors has been an important professional issue for several years. Rehabilitation counselor certification was the first credentialing process established. This article describes its history and the current criteria for certification.

17.
Examinees who take credentialing tests and other types of high-stakes assessments are usually provided an opportunity to repeat the test if they are unsuccessful on initial attempts. To prevent examinees from obtaining unfair score increases by memorizing the content of specific test items, testing agencies usually assign an alternate form to repeat examinees. Given that the use of multiple forms presents both practical and psychometric challenges, it is important to determine if unwarranted score gains occur. Most research indicates that repeat examinees realize score gains when taking the same form twice; however, the research is far from conclusive, particularly within the context of credentialing. For the present investigations, two samples of repeat examinees were randomly assigned to receive either the same test form or a different, but parallel, form on the second occasion. Study 1 found score gains of about 0.79 SD units for 71 examinees who repeated a certification examination in computed tomography. Study 2 found gains of 0.48 SD units for 765 examinees who repeated a radiography certification examination. In both studies score gains for examinees receiving the parallel test were nearly indistinguishable from score gains for those who received the same test. Factors are identified that may influence the generalizability of these findings to other assessment contexts.

18.
A variety of problems have been experienced with psychological assessment of minority children. Traditional norm-referenced measurement has repeatedly received criticism concerning cultural unfairness or bias. Responses to such accusations primarily have been in the form of new instrumentation aimed at attaining a culture fair assessment. Little response has been evident from a conceptual standpoint addressing the issues of purpose and use of test results. Although many have turned to criterion-referenced measurement as an answer to the problems of norm-referenced evaluation, cultural bias is not necessarily avoided in this framework either. Issues of who determines criteria and what those criteria include must be addressed if criterion-referenced measurement is to meet adequately the challenge of multicultural evaluation.

19.
The preparation and credentialing of marital and family therapists in the United States and Canada continues to be significantly affected by the role of accreditation in MFT graduate education. This report on a study of Commission on Accreditation for Marriage and Family Therapy Education accredited degree programs and non-accredited programs shows some significant differences between the two paths to preparation and credentialing. Accredited programs tend to have more faculty, lower faculty-student ratios, more Approved Supervisors, more financial aid, more programs requiring practica and internships, and more emphasis on professional identification with marital and family therapy. Nonaccredited programs provide more emphasis on psychopathology, psychodiagnostic testing, and cognitive behavioral therapy.
