Similar Documents
20 similar documents found.
1.
Randomization statistics offer alternatives to many of the statistical methods commonly used in behavior analysis and the psychological sciences, more generally. These methods are more flexible than conventional parametric and nonparametric statistical techniques in that they make no assumptions about the underlying distribution of outcome variables, are relatively robust when applied to small‐n data sets, and are generally applicable to between‐groups, within‐subjects, mixed, and single‐case research designs. In the present article, we first will provide a historical overview of randomization methods. Next, we will discuss the properties of randomization statistics that may make them particularly well suited for analysis of behavior‐analytic data. We will introduce readers to the major assumptions that undergird randomization methods, as well as some practical and computational considerations for their application. Finally, we will demonstrate how randomization statistics may be calculated for mixed and single‐case research designs. Throughout, we will direct readers toward resources that they may find useful in developing randomization tests for their own data.
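As an illustration of the basic idea (not drawn from the article itself), the minimal Python sketch below runs a two-condition randomization test: condition labels are repeatedly reshuffled and the observed mean difference is compared with the resulting reference distribution. The function name and the toy data are hypothetical.

```python
import numpy as np

def randomization_test(x, y, n_permutations=10000, seed=0):
    """Two-sample randomization (permutation) test on the mean difference.

    Returns the observed difference and an approximate two-sided p-value,
    estimated by reshuffling condition labels rather than assuming any
    sampling distribution.
    """
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([x, y])
    n_x = len(x)
    observed = np.mean(x) - np.mean(y)

    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)                       # reassign labels at random
        diff = pooled[:n_x].mean() - pooled[n_x:].mean()
        if abs(diff) >= abs(observed):            # as or more extreme than observed
            count += 1
    p_value = (count + 1) / (n_permutations + 1)  # add-one correction
    return observed, p_value

# Hypothetical small-n data from two conditions
baseline = np.array([3.1, 2.8, 3.5, 2.9, 3.0])
treatment = np.array([4.0, 3.7, 4.2, 3.9, 3.6])
print(randomization_test(baseline, treatment))
```

Because the reference distribution is built from the data themselves, no assumption about the underlying distribution of the outcome variable is required, which is the flexibility the abstract refers to.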

2.
3.
4.
5.
The complexity of cognitive emulation of human diagnostic reasoning is the major challenge in the implementation of computer-based programs for diagnostic advice in medicine. We here present an epistemological model of diagnosis with the ultimate goal of defining a high-level language for cognitive and computational primitives. The diagnostic task proceeds through three different phases: hypotheses generation, hypotheses testing, and hypotheses closure. Hypotheses generation has the inferential form of abduction (from findings to hypotheses) constrained under the criterion of plausibility. Hypotheses testing is achieved by a deductive inference (from generated hypotheses to expected findings), followed by an eliminative induction, constrained under the criterion of covering, which matches expected findings against the patient's findings to select the best explanation. Hypotheses closure is a deductive-inductive type of inference very similar to the inferences operating in hypotheses testing. In this case induction matches the consequences of the generated hypotheses against the patient's characteristics or preferences under the criterion of utility. By using the language exploited in this epistemological model, it is possible to describe the cognitive tasks underlying the most influential knowledge-based diagnostic systems.

6.
A behavioral test was developed to assess the quality of diagnostic interviewing skills of (future) mental health professionals. Two aspects of diagnostic interviewing ability are distinguished: process skills, reflecting interpersonal and communication skills; and content skills, referring to the information-gathering ability of the interviewer. It was found that diagnostic interviewing can be reliably measured with respect to interrater reliability. However, interviewer performance on one case proved to be a poor predictor of performance on other cases. It was concluded that a large number of cases is required to obtain reliable scores of general diagnostic interviewing ability. Validity was supported by the correlational analyses. Process skills were strongly related to patient satisfaction, whereas content skills were related to the amount of relevant information given by the patient and the accuracy of the diagnostic formulation and treatment plan.

7.
8.
Using 3 different samples, the authors assessed the incremental validity of situational judgment inventories (SJIs), relative to job knowledge, cognitive ability, job experience, and conscientiousness, in the prediction of job performance. The SJI was a valid predictor in all 3 samples and incrementally so in 2 samples. Relative to the other predictors, SJI's partial correlation with performance, controlling for the other 4 predictors, was superior in most comparisons. Subgroup differences on the SJI also appear to be less than those for cognitive ability and job knowledge, but greater than differences in conscientiousness. The SJI should prove to be a valuable additional measure in the prediction of job performance, but several additional areas of research are suggested.
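One way to make the notion of incremental validity concrete (a hedged sketch, not the authors' procedure) is to compare the variance in performance explained before and after adding the SJI to a regression containing the other predictors. The variable names and simulated data below are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an OLS fit, with an intercept column added."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 300
# Hypothetical standardized predictors: cognitive ability, job knowledge,
# job experience, conscientiousness, and the SJI score
ability, knowledge, experience, consc, sji = rng.standard_normal((5, n))
performance = 0.4 * ability + 0.3 * knowledge + 0.2 * sji + rng.standard_normal(n)

base = np.column_stack([ability, knowledge, experience, consc])
full = np.column_stack([base, sji])
delta_r2 = r_squared(full, performance) - r_squared(base, performance)
print(f"Incremental R^2 of the SJI: {delta_r2:.3f}")
```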

9.
10.
This article considers procedures for combining individual probability distributions that belong to some “family” into a “group” probability distribution that belongs to the same family. The procedures considered are Vincentizing, in which quantiles are averaged across distributions; generalized Vincentizing, in which the quantiles are transformed before averaging; and pooling based on the distribution function or the probability density function. Some of these results are applied to models of reaction time in psychological experiments.
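A minimal sketch of Vincentizing, assuming simple quantile averaging as described: for a grid of probability levels, compute each individual's quantiles and average them across individuals to obtain the group distribution. The function name and the simulated reaction-time data are hypothetical; generalized Vincentizing would transform the quantiles before averaging.

```python
import numpy as np

def vincentize(samples, probs):
    """Group distribution by Vincent averaging.

    For each probability level, the corresponding quantile is computed for
    every individual sample and then averaged across individuals.
    """
    quantiles = np.array([np.quantile(s, probs) for s in samples])
    return quantiles.mean(axis=0)   # average quantiles across individuals

rng = np.random.default_rng(0)
# Hypothetical reaction-time samples (seconds) for three participants
participants = [0.3 + rng.gamma(shape=2.0, scale=0.1, size=200) for _ in range(3)]
probs = np.linspace(0.1, 0.9, 9)
print(vincentize(participants, probs))
```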

11.
12.
Forced-choice (FC) tests are widely used in non-cognitive assessment because they can control the response biases associated with traditional Likert-type methods. However, the traditional scoring of forced-choice tests yields ipsative data, which have long been criticized as unsuitable for comparisons between individuals. In recent years, the development of various forced-choice IRT models has allowed researchers to obtain close-to-normative data from forced-choice tests, renewing the interest of researchers and practitioners in these models. We first classify and introduce six mainstream forced-choice IRT models according to the decision model and item response model they adopt. We then compare and summarize the models from two perspectives: model-construction rationale and parameter estimation methods. Next, we review three areas of applied research: tests of parameter invariance, computerized adaptive testing (CAT), and validity studies. Finally, we suggest that future research could go deeper in four directions: model extension, parameter invariance testing, forced-choice CAT, and validity research.

13.
14.
15.
李佳, 毛秀珍, 张雪琴. 《心理科学进展》 (Advances in Psychological Science), 2021, 29(12): 2272-2280.
The Q-matrix specifies the attributes measured by each item and thus reflects key item characteristics; its correctness is a critical factor for the classification accuracy of cognitive diagnosis, so research on Q-matrix estimation (correction) methods is of considerable value. First, depending on whether a cognitive diagnosis model is employed, Q-matrix estimation (correction) methods are divided into parametric methods based on cognitive diagnosis models and nonparametric methods based on a statistical perspective. These methods are then classified and introduced from the perspectives of optimal item quality, optimal model-data fit, and parameter estimation, evaluating their characteristics and performance, their differences and connections, and their strengths and weaknesses. Finally, several questions for future research are proposed: systematically comparing the methods under complex test conditions; developing Q-matrix estimation (correction) methods from multiple angles, such as calibrating knowledge states and parameter estimation error and combining multiple approaches; and studying Q-matrix estimation (correction) under conditions such as polytomously scored items, mixed test models, polytomous attributes, an unknown number of attributes, and even continuous Q-matrix entries.
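To make the role of the Q-matrix concrete (a hedged illustration, not taken from the review), the sketch below defines a small hypothetical Q-matrix and derives the ideal response patterns implied by a conjunctive, DINA-type rule; Q-matrix estimation and correction methods essentially ask how well such implied patterns fit the observed responses.

```python
import numpy as np

# Hypothetical Q-matrix: 4 items x 3 attributes (1 = item requires the attribute)
Q = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
])

def ideal_responses(Q, alpha):
    """Ideal (error-free) responses under a conjunctive (DINA-type) rule:
    an examinee answers an item correctly only if they master every
    attribute the Q-matrix assigns to that item."""
    return (alpha @ Q.T == Q.sum(axis=1)).astype(int)

# All 2^3 possible knowledge states (attribute mastery patterns)
states = np.array([[(k >> a) & 1 for a in range(3)] for k in range(8)])
print(ideal_responses(Q, states))
```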

16.
Sophisticated electronic systems have been developed to measure speech activity patterns automatically, but their accuracy is unknown. The purpose of the present work is to evaluate the fidelity with which a class of computerized systems matches the measurements made by a human observer. With all parameters optimized, it was found that: (1) about 98% of samples (sampling rate = 200/sec) were classified the same way by the system and the criterion method; (2) distributions of utterance durations and speaker-switch intervals were accurately rendered by the automatic system; (3) average durations of talkspurts—bursts of speech activity within utterances—were closely approximated, but the system tended to overestimate the number of brief (≤100 msec) talkspurts; (4) the system was subject to considerable qualitative error in the measurement of within-utterance pauses and of simultaneous talking.
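As a rough illustration of the kind of measurement involved (not the authors' system), the sketch below converts a binary speech-activity signal sampled at 200 samples/sec into talkspurt and pause durations by measuring run lengths; the function name and the toy signal are hypothetical.

```python
import numpy as np

def run_lengths(activity, sample_rate=200):
    """Durations (in seconds) of consecutive 'on' and 'off' runs in a
    binary speech-activity signal sampled at `sample_rate` samples/sec."""
    activity = np.asarray(activity, dtype=int)
    change_points = np.flatnonzero(np.diff(activity)) + 1   # indices where the state flips
    segments = np.split(activity, change_points)
    talkspurts = [len(s) / sample_rate for s in segments if s[0] == 1]
    pauses = [len(s) / sample_rate for s in segments if s[0] == 0]
    return talkspurts, pauses

# Hypothetical 1-second excerpt: silence, a 0.4-s talkspurt, a brief pause, more speech
signal = np.r_[np.zeros(40), np.ones(80), np.zeros(20), np.ones(60)]
talk, pause = run_lengths(signal)
print("talkspurts (s):", talk)
print("pauses (s):", pause)
```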

17.
A general critical analysis of the median tests proposed by Wilson for certain analysis of variance hypotheses is presented. Specifically, discrepancies between the purported and actual approximate distributions of some of the test statistics are noted. Validity and power of the resulting tests are discussed.

18.
C. M. Steele and J. Aronson (1995) showed that making race salient when taking a difficult test affected the performance of high-ability African American students, a phenomenon they termed stereotype threat. The authors document that this research is widely misinterpreted in both popular and scholarly publications as showing that eliminating stereotype threat eliminates the African American-White difference in test performance. In fact, scores were statistically adjusted for differences in students' prior SAT performance, and thus, Steele and Aronson's findings actually showed that absent stereotype threat, the two groups differ to the degree that would be expected based on differences in prior SAT scores. The authors caution against interpreting the Steele and Aronson experiment as evidence that stereotype threat is the primary cause of African American-White differences in test performance.

19.
The analogue functional analysis described by Iwata, Dorsey, Slifer, Bauman, and Richman (1982/1994) identifies broad classes of variables (e.g., positive reinforcement) that maintain destructive behavior (Fisher, Ninness, Piazza, & Owen-DeSchryver, 1996). However, it is likely that some types of stimuli may be more effective reinforcers than others. In the current investigation, we identified 2 participants whose destructive behavior was maintained by attention. We used concurrent schedules of reinforcement to evaluate how different types of attention affected both destructive and appropriate behavior. We showed that for 1 participant praise was not an effective reinforcer when verbal reprimands were available; however, praise was an effective reinforcer when verbal reprimands were unavailable. For the 2nd participant, we identified a type of attention that effectively competed with verbal reprimands as reinforcement. We then used the information obtained from the assessments to develop effective treatments to reduce destructive behavior and increase an alternative communicative response.

20.
In the diagnostic evaluation of educational systems, self-reports are commonly used to collect data, both cognitive and orectic. For various reasons, in these self-reports, some of the students' data are frequently missing. The main goal of this research is to compare the performance of different imputation methods for missing data in the context of the evaluation of educational systems. On an empirical database of 5,000 subjects, 72 conditions were simulated: three levels of missing data, three types of loss mechanisms, and eight methods of imputation. The levels of missing data were 5%, 10%, and 20%. The loss mechanisms were: missing completely at random, moderately conditioned, and strongly conditioned. The eight imputation methods used were: listwise deletion, replacement by the scale mean, by the item mean, by the subject mean, by the corrected subject mean, multiple regression, and the Expectation-Maximization (EM) algorithm, with and without auxiliary variables. The results indicate that data recovery is more accurate when an appropriate combination of different methods is used. When a case is incomplete, the subject mean works very well, whereas for completely lost data, multiple imputation with the EM algorithm is recommended. The use of this combination is especially recommended when data loss is greater and its loss mechanism is more conditioned. Lastly, the results are discussed, and some future lines of research are analyzed.
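For intuition about how such comparisons can be run (a hedged sketch, not the study's actual procedure), the code below deletes 10% of a hypothetical item-score matrix completely at random, imputes it with the item mean and with the subject mean, and compares recovery error; all names and data are invented, and the regression-based and EM methods from the study are not implemented here.

```python
import numpy as np

def impute_item_mean(X):
    """Replace each missing entry with the mean of its item (column)."""
    X = X.copy()
    col_means = np.nanmean(X, axis=0)
    idx = np.where(np.isnan(X))
    X[idx] = np.take(col_means, idx[1])
    return X

def impute_subject_mean(X):
    """Replace each missing entry with the mean of that subject's observed items (row)."""
    X = X.copy()
    row_means = np.nanmean(X, axis=1)
    idx = np.where(np.isnan(X))
    X[idx] = np.take(row_means, idx[0])
    return X

rng = np.random.default_rng(42)
complete = rng.integers(1, 6, size=(100, 10)).astype(float)   # hypothetical 1-5 item scores
mask = rng.random(complete.shape) < 0.10                       # 10% missing completely at random
observed = np.where(mask, np.nan, complete)

for name, method in [("item mean", impute_item_mean), ("subject mean", impute_subject_mean)]:
    rmse = np.sqrt(np.mean((method(observed)[mask] - complete[mask]) ** 2))
    print(f"{name}: RMSE = {rmse:.3f}")
```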
