Reliability captures the influence of error on a measurement and, in the classical setting, is defined as one minus the ratio of the error variance to the total variance. Laenen, Alonso, and Molenberghs (Psychometrika 73:443–448, 2007) proposed an axiomatic definition of reliability and introduced the R_T coefficient, a measure of reliability extending the classical approach to a more general longitudinal scenario. The R_T coefficient can be interpreted as the average reliability over different time points and can also be calculated for each time point separately. In this paper, we introduce a new and complementary measure, the so-called R_Λ, which implies a new way of thinking about reliability. In a longitudinal context, each measurement brings additional knowledge and leads to more reliable information. The R_Λ captures this intuitive idea and expresses the reliability of the entire longitudinal sequence, in contrast to an average or occasion-specific measure. We study the measure’s properties using both theoretical arguments and simulations, establish its connections with previous proposals, and elucidate its performance in a real case study. The authors are grateful to J&J PRD for kind permission to use their data. We gratefully acknowledge support from Belgian IUAP/PAI network “Statistical Techniques and Modeling for Complex Substantive Questions with Complex Data.”
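
The classical definition quoted in the abstract above (one minus the ratio of the error variance to the total variance) can be written as a short sketch. The function name and the variance values are hypothetical illustrations, not taken from the paper, and the sketch covers only the classical single-occasion case, not the longitudinal coefficients the authors propose.

```python
def classical_reliability(error_variance: float, total_variance: float) -> float:
    """Classical reliability: 1 - (error variance / total variance)."""
    if total_variance <= 0:
        raise ValueError("total_variance must be positive")
    if not 0 <= error_variance <= total_variance:
        raise ValueError("error_variance must lie between 0 and total_variance")
    return 1.0 - error_variance / total_variance


# Hypothetical variance components, chosen only for illustration.
true_score_variance = 4.0
error_variance = 1.0
total_variance = true_score_variance + error_variance

reliability = classical_reliability(error_variance, total_variance)
print(f"classical reliability: {reliability:.2f}")
```

With no error variance the function returns 1, and when the total variance is entirely error it returns 0, which matches the usual reading of reliability as the share of observed variance not due to error.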

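A 2 (within-subjects: scale size, 25-point vs. 9-point) × 2 (between-subjects: rating method, relative vs. absolute) mixed experimental design was used to examine how rating scales affect the rating accuracy of 115 undergraduate novice raters. Rating accuracy was assessed with the four indices proposed by Cronbach (1955): Elevation (EL), Differential Elevation (DE), Stereotype Accuracy (SA), and Differential Accuracy (DA). Rating method had a significant main effect only on SA, scale size had a marginally significant main effect only on DA, and rating method and scale size interacted on DE, SA, and DA. Overall, for rating accuracy in structured interview scoring, relative rating scales outperformed absolute rating scales, and small scales outperformed large scales.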

This study examined the treatment sensitivity of the ADHD Questionnaire (ADHD-Q), a brief rating scale for measuring symptoms of inattention, hyperactivity, and impulsivity in children. Parent, teacher, and child self-report data on the ADHD-Q were obtained for 17 clinically referred children with ADHD on three occasions: (1) during the regular intake assessment, (2) just before the start of the stimulant medication (i.e., methylphenidate) intervention, and (3) four weeks after the start of the medication intervention. Results showed that ADHD-Q scores remained fairly stable in the period prior to the intervention, but then showed a substantial decline after the stimulant medication had been administered. This finding supports the treatment sensitivity of the ADHD-Q.

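The reliability of the measurement instrument is an important indicator of the quality of a longitudinal study. Traditional reliability estimation methods are not suited to estimating test reliability in longitudinal research. In recent years, researchers have proposed four methods for estimating it: the coefficients rw and r(Sw), which estimate test reliability at a single time point, and the coefficients RT and RL, which estimate test reliability for the longitudinal study as a whole. This paper reviews the mathematical models, assumptions, strengths, and weaknesses of these four methods. RT and RL can estimate test reliability both at a single time point and for the entire longitudinal study, and they require fewer assumptions; using RT and RL together is therefore recommended for estimating test reliability in longitudinal research.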

The Conners Teacher Rating Scale (CTRS) is a commonly used research and clinical tool for assessing children's behavior in the classroom. The present study introduces the revised CTRS (CTRS-R) which improves on the original CTRS by (1) establishing normative data from a large, representative North American sample, (2) deriving a factor structure using advanced statistical techniques, and (3) updating the item content to reflect current conceptualizations of childhood disorders. Using confirmatory factor analysis, a six-factor structure was found which includes Hyperactivity-Impulsivity, Perfectionism, Inattention/Cognitive Problems, Social Problems, Oppositionality, and Anxious/Shy factors. The reliability of the scale, as measured by test-retest correlations and internal consistency, is generally satisfactory. Using all of the scale factors to discriminate between attention deficit hyperactivity disordered and normal children, 85 percent of children were correctly classified, supporting the validity of the scale and indicating excellent clinical utility. Similarities and differences between the original CTRS factor structure and the CTRS-R factor structure are discussed.

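Mediation effects are currently tested mainly with cross-sectional data, but mediation analysis of cross-sectional data is often unsuitable for causal inference, so longitudinal data need to be collected and analyzed instead. This paper reviews longitudinal mediation analysis methods based on the cross-lagged panel model, the multilevel linear model, and the latent growth model, together with four lines of development. First, mediation effects that vary over time, as in continuous-time models and multilevel time-varying coefficient models. Second, mediation effects that vary across individuals, as in the random-effects cross-lagged panel model and the multilevel autoregressive model. Third, the integration of mediation models, such as combining the cross-lagged panel model and the multilevel linear model into the multilevel autoregressive model. Fourth, developments in testing methods: Monte Carlo, bootstrap, and Bayesian methods are recommended for longitudinal mediation analysis. A procedure for longitudinal mediation analysis is summarized and the corresponding Mplus programs are provided. Directions for extending longitudinal mediation analysis are then discussed.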

Growth curve modeling is one of the main analytical approaches to study change over time. Growth curve models are commonly estimated in the linear and nonlinear mixed-effects modeling framework in which both the mean and person-specific curves are modeled parametrically with functions of time such as the linear, quadratic, and exponential. However, when more complex nonlinear trajectories need to be estimated and researchers do not have a priori knowledge of an appropriate functional form of growth, parametric models may be too restrictive. This paper reviews functional mixed-effects models, a nonparametric extension of mixed-effects models that permit both the mean and person-specific curves to be estimated without assuming a prespecified functional form of growth. Details of the model are presented along with results from a simulation study and an empirical example. The simulation study showed functional mixed-effects models performed reasonably well under various conditions commonly associated with longitudinal panel data, such as few time points per person, irregularly spaced time points across persons, missingness, and nonlinear trajectories. The usefulness of functional mixed-effects models is illustrated by analyzing empirical data from the Early Childhood Longitudinal Study – Kindergarten Class of 1998–1999.

This paper describes four studies on self-reported problems in 2,243 adolescent males and females, 12 to 17 years of age. In Study 1, principal-axis factoring of 102 items covering 11 problem domains revealed six factors accounting for 49.5% of the variance. Study 2 used confirmatory factor analysis of a 64-item reduced set on a new sample of 408 adolescents. Goodness-of-fit indicators suggested that the six-factor model had excellent fit to the data. Study 3 used data from the 2,157 adolescents used in the first two studies. Coefficient alphas ranged from .83 to .92. Median test-retest reliability for the six factors was .86. There was a consistent structure of the correlation matrix across age and gender. Study 4 was a study of criterion validity, using an additional sample of 86 children with attention-deficit hyperactivity disorder (ADHD). Sensitivity and specificity were high, with an overall diagnostic efficiency of 83%. This new self-report scale, the Conners/Wells Adolescent Self-Report of Symptoms (CASS), may provide a useful component of a multimodal assessment of adolescent psychopathology.
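
Coefficient alpha, the internal-consistency statistic reported in the abstract above, is conventionally computed as k/(k-1) × (1 - sum of item variances / variance of total scores), where k is the number of items. The sketch below illustrates that standard formula only. The function name and the item scores are hypothetical, and nothing here reproduces the authors' data or analysis code.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Coefficient alpha for a list of items.

    item_scores: one list of respondent scores per item; all lists must
    have the same length (one entry per respondent).
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    n = len(item_scores[0])
    if n < 2 or any(len(item) != n for item in item_scores):
        raise ValueError("each item needs the same number (>= 2) of scores")

    # Sum of the sample variances of the individual items.
    item_variance_sum = sum(variance(item) for item in item_scores)

    # Sample variance of each respondent's total score across items.
    totals = [sum(scores) for scores in zip(*item_scores)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance; alpha is undefined")

    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


# Hypothetical scores: three items answered identically by four respondents.
hypothetical_items = [
    [1, 2, 3, 4],
    [1, 2, 3, 4],
    [1, 2, 3, 4],
]
alpha = cronbach_alpha(hypothetical_items)
print(f"alpha = {alpha:.3f}")
```

Identical items are the limiting case in which alpha reaches 1; real scales, such as the six factors described above, fall below that as items share less variance.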

Models for rankings have been shown to produce more efficient estimators than comparable models for first/top choices. The discussions and applications of these models typically only consider unordered alternatives. But these models can be usefully adapted to the case where a respondent ranks a set of alternatives that are ordered response categories. This paper proposes eliciting a rank order that is consistent with the ordering of the response categories, and then modelling the observed rankings using a variant of the rank ordered logit model where the distribution of rankings has been truncated to the set of admissible rankings. This results in lower standard errors than when respondents select only a single top category. The restrictions on the set of admissible rankings also reduce the number of decisions respondents need to make in comparison to ranking a set of unordered alternatives. Simulation studies and application examples featuring models based on a stereotype regression model and a rating scale item response model are provided to demonstrate the utility of this approach.

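Li Guangming (黎光明), Jiang Huan (蒋欢). Psychological Science (《心理科学》), 2019, (3): 731-738
Tests that include a rater facet usually do not conform to any single generalizability theory design, so from the perspective of generalizability theory the data from such tests should be treated as missing data, and the missingness structure is determined by the test's scoring plan. Data under three scoring plans were simulated in R, and the estimation performance of the traditional method, the evaluation method (评价法), and the splitting method (拆分法) was compared under each plan. The results show that: (1) the traditional method estimates poorly; (2) the evaluation method is appropriate when rater consistency is high; (3) the splitting method gives the most accurate estimates, and only under the fixed-rater scoring plan does the ratio of the number of raters to the number of examinees need attention, with estimates being fairly accurate when this ratio is less than or equal to 0.0047.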

The Adelaide-Conners Parent Rating Scale (APRS), an instrument developed by studying a large, representative group of schoolchildren, was used with a group of psychiatry attenders. Multimethod factor analysis found satisfactory agreement between the factor structures of the clinical and the normative groups. The patterns of scores on the 12 APRS scales were also compared. Two higher-order factors (Conflict with the Environment and Conflict within the Self) were identified in the clinical sample as previously found in the normative group. Comparison of the factor solutions with previous empirical efforts to identify parent-perceived patterns of child behavior disorder showed that the APRS compares well with other instruments and supports the strategy of proceeding from the study of normative populations to the study of clinically defined groups. This work was supported in part by a grant from the Department of Health, Canberra.

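Reliability and validity of the Chinese version of the Teacher Efficacy Scale
This study examined the psychometric properties of the Chinese version of the Teacher Efficacy Scale (TES) in a sample of 405 secondary school teachers. The results show that the Chinese TES has good reliability, construct validity, and concurrent validity, indicating that the scale is suitable for wider use, although some issues should be kept in mind when using it. The study also found that teacher characteristics such as years of teaching and professional title had no significant effect on teacher efficacy.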

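Reliability, validity, and norms of the Chinese Personality Adjective Rating Scale (QZPAS)
Cui Hong (崔红), Wang Dengfeng (王登峰). Psychological Science (《心理科学》), 2004, 27(1): 185-188
The aim of this study was to establish the reliability, validity, and norms of the Chinese Personality Adjective Rating Scale (QZPAS). Ratings by more than 4,000 participants on the QZPAS, which consists of 123 adjectives, supported the "Big Seven" model of Chinese personality, and each factor showed good internal consistency and test-retest reliability. Correlations between self-ratings and other-ratings, and between overall self-ratings and scale scores, indicated that the QZPAS has good criterion validity. The norms provided in this study also lay a foundation for applying the scale.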

Sequential multiple assignment randomized trials (SMARTs) are a useful and increasingly popular approach for gathering information to inform the construction of adaptive interventions to treat psychological and behavioral health conditions. Until recently, analysis methods for data from SMART designs considered only a single measurement of the outcome of interest when comparing the efficacy of adaptive interventions. Lu et al. proposed a method for considering repeated outcome measurements to incorporate information about the longitudinal trajectory of change. While their proposed method can be applied to many kinds of outcome variables, they focused mainly on linear models for normally distributed outcomes. Practical guidelines and extensions are required to implement this methodology with other types of repeated outcome measures common in behavioral research. In this article, we discuss implementation of this method with repeated binary outcomes. We explain how to compare adaptive interventions in terms of various summaries of repeated binary outcome measures, including average outcome (area under the curve) and delayed effects. The method is illustrated using an empirical example from a SMART study to develop an adaptive intervention for engaging alcohol- and cocaine-dependent patients in treatment. Monte Carlo simulations are provided to demonstrate the good performance of the proposed technique.

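Questionnaires are a common empirical research method. The preparatory work for modeling questionnaire data is like laying the foundation of a building: how solid it is affects the quality of everything built afterward. This paper deals specifically with what needs to be done before a statistical model is built (with an emphasis on scale evaluation), including: handling missing values; evaluating the structural validity of the scale and the appropriateness of item deletion; testing homogeneity and computing composite reliability when a multidimensional scale needs a total score; testing for common method bias and evaluating (variable) discriminant validity; item parceling; and testing for multicollinearity among independent variables. Finally, it also touches on the rationale for model building and the control of extraneous variables.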

Research studies in psychology and education often seek to detect changes or growth in an outcome over a duration of time. This research provides a solution to those interested in estimating latent traits from psychological measures that rely on human raters. Rater effects potentially degrade the quality of scores in constructed response and performance assessments. We develop an extension of the hierarchical rater model (HRM), which yields estimates of latent traits that have been corrected for individual rater bias and variability, for ratings that come from longitudinal designs. The parameterization, called the longitudinal HRM (L-HRM), includes an autoregressive time series process to permit serial dependence between latent traits at adjacent timepoints, as well as a parameter for overall growth. We evaluate and demonstrate the feasibility and performance of the L-HRM using simulation studies. Parameter recovery results reveal predictable amounts and patterns of bias and error for most parameters across conditions. An application to ratings from a study of character strength demonstrates the model. We discuss limitations and future research directions to improve the L-HRM.

We present a mixed-effects location scale model (MELSM) for examining the daily dynamics of affect in dyads. The MELSM includes person and time-varying variables to predict the location, or individual means, and the scale, or within-person variances. It also incorporates a submodel to account for between-person variances. The dyadic specification can accommodate individual and partner effects in both the location and the scale components, and allows random effects for all location and scale parameters. All covariances among the random effects, within and across the location and the scale, are also estimated. These covariances offer new insights into the interplay of individual mean structures, intra-individual variability, and the influence of partner effects on such factors. To illustrate the model, we use data from 274 couples who provided daily ratings on their positive and negative emotions toward their relationship for up to 90 consecutive days. The model is fit using Hamiltonian Monte Carlo methods, and includes subsets of predictors in order to demonstrate the flexibility of this approach. We conclude with a discussion on the usefulness and the limitations of the MELSM for dyadic research.

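To explore the personality traits and structure of Chinese people in depth and to develop an indigenous personality scale, this study drew on the factor naming of the QZPS, CPAI-2, and CPFFI to construct a personality lexical rating form containing 116 items. Exploratory analysis of the ratings of 1,455 participants yielded a personality lexical rating scale with 7 dimensions. The 7 factors of the scale accounted for 51.63% of the total variance; their internal consistency reliabilities ranged from 0.663 to 0.912, and that of the full scale was 0.800; the test-retest reliabilities of the 7 factors ranged from 0.700 to 0.874 (p<0.001). The results indicate that emotionality and extraversion are personality traits that people have in common, while the other factors in the scale include both aspects that converge with Western personality factors and indigenous content rooted in the Chinese cultural context. Compared with the QZPS, CPAI-2, and CPFFI, this factor structure covers the great majority of the personality factor content of those three models, and its structure is clear and comprehensive.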
