
1.

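Guan Dandan, Zhang Houcan, Li Zhongquan 《心理科学》2005,28(1):161-163

This paper raises the question of how the reliability of difference scores varies and uses simulated data to analyze how it changes under different conditions. The results show that: (1) when the reliability coefficients of the two test scores are equal or similar, the greater the difference between the two standard deviations, the higher the reliability of the difference score; (2) when the reliability coefficients of the two test scores are unequal, the reliability of the difference score is also relatively high as long as either administration has both a higher reliability and a larger standard deviation than the other; (3) regardless of how the two reliabilities are related, the lower the correlation between the two tests, the higher the reliability of the difference score.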
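The three patterns reported in this abstract can be checked against the standard classical-test-theory formula for the reliability of a difference score D = X - Y. The sketch below is an illustration only, not the authors' simulation: it assumes uncorrelated measurement errors, and the function name and the numeric inputs are hypothetical.

```python
def difference_score_reliability(rel_x, rel_y, sd_x, sd_y, r_xy):
    """Reliability of D = X - Y under classical test theory.

    Assumes the errors of X and Y are uncorrelated, so the covariance of
    the true scores equals the covariance of the observed scores.
    """
    cov_xy = r_xy * sd_x * sd_y
    true_var = rel_x * sd_x ** 2 + rel_y * sd_y ** 2 - 2 * cov_xy
    observed_var = sd_x ** 2 + sd_y ** 2 - 2 * cov_xy
    if observed_var <= 0:
        raise ValueError("difference score has no observed variance")
    return true_var / observed_var


# Hypothetical inputs chosen only to illustrate the three patterns.
baseline = difference_score_reliability(0.8, 0.8, 1.0, 1.0, 0.5)
unequal_sd = difference_score_reliability(0.8, 0.8, 1.0, 2.0, 0.5)
aligned = difference_score_reliability(0.9, 0.7, 2.0, 1.0, 0.5)
misaligned = difference_score_reliability(0.9, 0.7, 1.0, 2.0, 0.5)
low_corr = difference_score_reliability(0.8, 0.8, 1.0, 1.0, 0.2)

print(f"equal SDs: {baseline:.3f}, unequal SDs: {unequal_sd:.3f}")
print(f"higher reliability with larger SD: {aligned:.3f}, with smaller SD: {misaligned:.3f}")
print(f"r = 0.5: {baseline:.3f}, r = 0.2: {low_corr:.3f}")
```

With these inputs the unequal-SD case, the case where one administration has both the higher reliability and the larger standard deviation, and the low-correlation case each yield a higher value than their comparison case, in line with the three findings.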

2.

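A Meta-Analysis of Composite Reliability for Unidimensional Tests

Ye Baojuan, Wen Zhonglin, Hu Zhujing 《心理科学》2013,36(6):1464-1469

Meta-analysis is an important method for drawing relatively accurate and representative conclusions about a topic of interest from existing studies, and it is widely used in psychology, education, management, medicine, and other fields of social science research. Reliability is an important indicator of test quality, and composite reliability gives a relatively accurate estimate of test reliability. No published method for meta-analyzing composite reliability has been found. After comparing the strengths and weaknesses of three models for the meta-analysis of parameters, this study derives point and interval estimation methods for the meta-analysis of composite reliability under the varying-coefficient model. A simulation study using interval coverage rate as the criterion shows that the proposed interval estimation method performs appropriately. An example illustrates how to meta-analyze the composite reliability of unidimensional tests.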

3.

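A Comparison of Two Methods for Estimating Confidence Intervals of Composite Reliability for Multidimensional Tests

Yang Qiang, Ye Baojuan, Wen Zhonglin 《心理学探新》2014,(1):43-47,52

Two methods can be used to estimate the confidence interval of the composite reliability of a multidimensional test: the Bootstrap method and the Delta method. This paper compares the two in a simulation study and finds that their results differ very little. Because Bootstrap results are empirical and are usually regarded as reflecting the true value, while the Delta method is much simpler than the Bootstrap method, the Delta method can be used to estimate the confidence interval of composite reliability. An example demonstrates how to compute the composite reliability of a multidimensional test and its Delta-method confidence interval.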

4.

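A Comparison of Three Interval Estimates of Composite Reliability for Unidimensional Tests (total citations: 3; self-citations: 0; citations by others: 3)

Ye Baojuan, Wen Zhonglin 《心理学报》2011,43(4):453-461

Many studies have recommended using composite reliability to estimate test reliability and reporting its confidence interval. Three approaches are available for computing the confidence interval of the composite reliability of a unidimensional test: the Bootstrap method, the Delta method, and direct use of the standard error output by statistical software (such as LISREL). A simulation comparison finds that the Delta and Bootstrap confidence intervals are quite close, whereas intervals computed from the LISREL standard error differ greatly from the Bootstrap results. The Delta method is recommended for estimating the confidence interval of composite reliability (it is easy to implement in Mplus), and the standard error output by LISREL should not be used directly. An example illustrates how to compute the composite reliability of a unidimensional test and its Delta-method confidence interval.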

5.

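Estimating Confidence Intervals of Composite Reliability for Multidimensional Tests with the Delta Method

Ye Baojuan, Wen Zhonglin 《心理科学》2012,35(5):1213-1217

A large body of research shows that composite reliability generally estimates test reliability well. Methods for estimating composite reliability and its confidence interval have been studied fairly extensively for unidimensional tests, but few studies have discussed interval estimation of composite reliability for multidimensional tests. This paper uses the Delta method to derive a standard error formula for the composite reliability of multidimensional tests, from which the confidence interval is computed, and uses an example to show how to program the estimation of composite reliability and its confidence interval for a multidimensional test.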

6.

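Computing Several Commonly Used Test Reliability Estimates with Mplus

Wang Mengcheng, Ye Baojuan 《心理学探新》2014,(1):48-52

Cronbach's alpha coefficient has many drawbacks as an index of reliability. To overcome them, researchers have proposed a variety of reliability estimates, but popular statistical software does not yet provide these quantities directly, so they have not been widely adopted in practice. To narrow the gap between theory and practice, this article uses worked examples to give Mplus programs for several commonly used reliability estimates (composite reliability, single-indicator reliability, and ωh).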

7.

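Estimating Test Reliability: From Coefficient α to Internal Consistency Reliability (total citations: 5; self-citations: 0; citations by others: 5)

Wen Zhonglin, Ye Baojuan 《心理学报》2011,43(7):821-829

Following the classical definition of test reliability, this paper briefly introduces the relationship between reliability and coefficient α as well as the limitations of coefficient α. To recommend reliability estimates that can replace coefficient α, it discusses in depth homogeneity reliability and internal consistency reliability, both closely related to coefficient α. Under very general conditions, it proves that neither coefficient α nor homogeneity reliability exceeds internal consistency reliability, which in turn does not exceed test reliability, indicating that internal consistency reliability is relatively close to test reliability. A workflow for test reliability analysis is summarized, showing when coefficient α still has reference value and when it is no longer applicable and internal consistency reliability (often called composite reliability in the literature) should be used instead. Programs for computing homogeneity reliability and internal consistency reliability are provided, which applied researchers can use directly.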
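The ordering between coefficient α and composite reliability described in this abstract can be illustrated for the simplest case. The sketch below is not the program provided in the paper: it assumes a one-factor (congeneric) model with uncorrelated errors and factor variance fixed at 1, and the function names, loadings, and error variances are hypothetical.

```python
def composite_reliability(loadings, error_variances):
    """Composite reliability of the unweighted sum score.

    Assumes a single factor with variance 1 and uncorrelated errors.
    """
    common = sum(loadings) ** 2
    return common / (common + sum(error_variances))


def coefficient_alpha(loadings, error_variances):
    """Population coefficient alpha implied by the same one-factor model."""
    k = len(loadings)
    item_variances = [l ** 2 + e for l, e in zip(loadings, error_variances)]
    total_variance = sum(loadings) ** 2 + sum(error_variances)
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical standardized loadings; error variance is 1 - loading^2.
loadings = [0.5, 0.6, 0.7, 0.8]
errors = [1 - l ** 2 for l in loadings]
omega = composite_reliability(loadings, errors)
alpha = coefficient_alpha(loadings, errors)
print(f"composite reliability = {omega:.4f}, alpha = {alpha:.4f}")

# With equal loadings (a tau-equivalent test) the two coefficients coincide.
equal_loadings = [0.7] * 4
equal_errors = [1 - l ** 2 for l in equal_loadings]
omega_equal = composite_reliability(equal_loadings, equal_errors)
alpha_equal = coefficient_alpha(equal_loadings, equal_errors)
```

Under these assumptions α falls slightly below composite reliability when the loadings differ and matches it when they are equal. The sketch does not cover correlated errors or multidimensional tests.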

8.

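A New Exploration of Methods for Testing Mediation Effects

Li Jing, Zhao Bihua 《社会心理科学》2014,(7)

This paper reviews the basic concept of a mediating variable and the basic method for testing mediation effects proposed by Baron and Kenny, reconsiders their theoretical views, and, drawing on several testing methods, introduces a new testing procedure. The procedure offers a new classification of mediation effects, treats the significance of the indirect effect a×b as a precondition for testing mediation, and recommends testing it with the bootstrap method.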

9.

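Estimating Confidence Intervals of Composite Reliability for Tests with Correlated Errors Using the Delta Method: The FAD as an Example

Ye Baojuan, Yang Qiang 《心理学探新》2015,(3):251-256

Many studies show that composite reliability estimates test reliability well. Existing research on confidence intervals for composite reliability has assumed that item measurement errors are uncorrelated, yet correlated errors do arise in empirical research; in that case coefficient α tends to overestimate test reliability, and composite reliability gives a more accurate estimate. This paper presents a Delta-method standard error formula for the composite reliability of general unidimensional tests; the formula applies whether or not the measurement errors are correlated, and the confidence interval of composite reliability can be computed from it. A survey of 600 adolescents found correlated measurement errors among the reverse-scored items of the "General Functioning" subscale of the Chinese version of the FAD, and the paper demonstrates how to estimate the composite reliability of this subscale and its confidence interval.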

10.

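Reliability Estimation Formulas for Criterion-Referenced Tests (total citations: 4; self-citations: 0; citations by others: 4)

Chen Xizhen 《心理学报》1996,29(4):436-442

A criterion-referenced test differs from a norm-referenced test: a person's score is compared not with other people's scores but with a preset standard. If the test is constructed by random sampling from the domain of a given subject, users want to know how close an examinee's observed score on the test is to the examinee's score on that domain (were it known); if users want to classify examinees as masters or non-masters on the basis of test scores, they care about how consistent this inference is with the one that would be made if the examinees' domain scores were known. This paper examines reliability estimation for these two problems and obtains several useful estimation formulas.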

11.

Issues of Reliability in Measuring Intimate Partner Violence during Courtship

Kathryn M. Ryan 《Sex roles》2013,69(3-4):131-148

The current paper focuses on problems in conceptualizing and establishing reliability when using self-administered measures of intimate partner violence to assess dating violence. Establishing reliability is an important step in the development of dating violence assessment instruments. However, the nature of dating violence can make it difficult to establish reliability. Most notably, measures of intimate partner violence in courtship yield data that are positively skewed, with almost no one reporting high levels of violence. This could have implications for the calculation of several forms of reliability that assume normality (e.g., Pearson correlations, intraclass correlations). In addition, there are other characteristics of dating violence that could impact reliability. For example, partner violence perpetrators do not necessarily use multiple acts (internal consistency reliability) or repeat specific acts (test-retest reliability). And, gender differences in the perception of partner violence may influence intra-couple reliability in heterosexual couples. Finally, statistical interdependence within couples makes current intra-couple reliability assessment suspect. Research is reviewed and recommendations are made concerning the establishment of test-retest reliability, intra-couple reliability, and internal consistency reliability for measures of dating violence.

12.

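Estimating Test Reliability in Longitudinal Studies

Ye Baojuan, Wen Zhonglin, Chen Qishan 《心理科学进展》2012,20(3):467-474

The reliability of the measurement instrument is an important indicator of the quality of a longitudinal study. Traditional reliability estimation methods are not suitable for longitudinal studies. In recent years researchers have proposed four reliability estimates for longitudinal studies: the coefficients rw and r(Sw), which estimate test reliability at a single time point, and the coefficients RT and RL, which estimate test reliability for the longitudinal study as a whole. This paper reviews the mathematical models, assumptions, and strengths and weaknesses of these four methods. RT and RL can estimate test reliability both at a single time point and for the whole longitudinal study and require fewer assumptions, so using RT and RL together is recommended for estimating test reliability in longitudinal studies.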

13.

The content reliability of a test

Harold Gulliksen 《Psychometrika》1936,1(3):189-194

The content unreliability of an essay test is the error due to the items used or the content of the test. The reader unreliability is due to variation in judgment of the persons who read and score the essay test. The content reliability of an essay test is accordingly defined as being independent of the reader reliability. Formulae are derived for the reader reliability and for the content reliability. The content reliability is found to be equal to the geometric mean of the test reliabilities computed from the scores assigned by the two readers, divided by the reader reliability.
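The closing relation in this abstract is simple enough to write out. The sketch below transcribes only that one sentence (geometric mean of the two readers' test reliabilities, divided by the reader reliability); it does not reproduce the paper's derivation, and the function name, argument names, and numeric inputs are hypothetical.

```python
import math


def content_reliability(test_rel_reader_1, test_rel_reader_2, reader_rel):
    """Content reliability as stated in the abstract.

    test_rel_reader_1, test_rel_reader_2: test reliabilities computed from
        the scores assigned by each of the two readers.
    reader_rel: reliability of the readers' scoring.
    """
    if min(test_rel_reader_1, test_rel_reader_2) < 0 or reader_rel <= 0:
        raise ValueError("reliabilities must be non-negative and reader_rel positive")
    geometric_mean = math.sqrt(test_rel_reader_1 * test_rel_reader_2)
    return geometric_mean / reader_rel


# Hypothetical values, for illustration only.
example = content_reliability(0.64, 0.81, 0.90)
print(f"content reliability = {example:.3f}")
```

Because the reader reliability is at most 1, the result is never smaller than the geometric mean of the two test reliabilities.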

14.

Inter-rater and test-retest reliability of the Revised Diagnostic Interview for Borderlines (total citations: 1; self-citations: 0; citations by others: 1)

Zanarini MC, Frankenburg FR, Vujanovic AA 《Journal of personality disorders》2002,16(3):270-276

The baseline inter-rater reliability, test-retest reliability, follow-up inter-rater reliability, and follow-up longitudinal reliability (inter-rater reliability between generations of raters) of borderline symptoms and the diagnosis of borderline personality disorder (BPD) were assessed using the Revised Diagnostic Interview for Borderlines (DIB-R). Excellent kappas (> .75) were found in each of these reliability substudies for the diagnosis of BPD itself. Excellent kappas were also found in each of the three inter-rater reliability substudies for the vast majority of borderline symptoms assessed by the DIB-R. Test-retest reliability for these symptoms was somewhat lower but still very good. More specifically, one-third of the BPD symptoms assessed had a kappa in the excellent range and the remaining two-thirds had a kappa in the fair-good range (.57-.73). The dimensional reliability of BPD symptom areas was somewhat higher than for categorical measures of the subsyndromal phenomenology of BPD. More specifically, all five dimensional measures of borderline psychopathology had intraclass correlation coefficients in the excellent range for all four reliability substudies. Taken together, the results of this study suggest that both the borderline diagnosis and the symptoms of BPD can be diagnosed reliably when using the DIB-R. They also suggest that excellent reliability, once achieved, can be maintained over time for both the syndromal and subsyndromal phenomenology of BPD.

15.

Considerations in the choice of interobserver reliability estimates

Hartmann DP 《Journal of applied behavior analysis》1977,10(1):103-116

Two types of interobserver reliability values may be needed in treatment studies in which observers constitute the primary data-acquisition system: trial reliability and the reliability of the composite unit or score which is subsequently analyzed, e.g., daily or weekly session totals. Two approaches to determining interobserver reliability are described: percentage agreement and "correlational" measures of reliability. The interpretation of these estimates, factors affecting their magnitude, and the advantages and limitations of each approach are presented.

16.

Psychometric evaluation of trauma and posttraumatic stress disorder assessments in persons with severe mental illness

Mueser KT, Salyers MP, Rosenberg SD, Ford JD, Fox L, Carty P 《Psychological Assessment》2001,13(1):110-117

Interrater reliability, internal consistency, test-retest reliability, and convergent validity were examined for the Trauma History Questionnaire (THQ), the Clinician-Administered Posttraumatic Stress Disorder (PTSD) Scale (CAPS), and the PTSD Checklist (PCL) in 30 clients with severe mental illnesses. Interrater reliability for the THQ and CAPS was high, as was internal consistency of CAPS and PCL subscales. The test-retest reliability of the THQ was moderate to high for different traumas. PTSD diagnoses on the CAPS and PCL showed moderate test-retest reliability. Lower levels of test-retest reliability for PTSD diagnoses were related to psychosis diagnoses and symptoms. However, when more stringent criteria for PTSD were used on the CAPS, it had excellent test-retest reliability across all clients. CAPS and PCL diagnoses of PTSD showed moderate convergent validity. The results support the reliability of trauma and PTSD assessments in clients with severe mental illness.

17.

A Breakdown of Reliability Coefficients by Test Type and Reliability Method, and the Clinical Implications of Low Reliability

Richard A. Charter 《The Journal of general psychology》2013,140(3):290-304


18.

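Rethinking Reliability and Reliability Generalization Research

Guan Dandan, Zhang Houcan 《心理科学》2004,27(2):445-448

This paper first clarifies the concept of reliability, pointing out that reliability is an index of whether test results are dependable rather than a fixed property of the test instrument. In view of the variability of reliability estimates for test results, it introduces the reliability generalization method proposed by Vacha-Haase at the end of the twentieth century, a meta-analytic method for exploring the variability of score reliability estimates and examining the sources that predict this variation. Finally, through an analysis of reliability generalization techniques, it argues that rethinking the concept of reliability and conducting reliability generalization research will offer new insights to those working in psychological testing.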