Similar Documents
20 similar documents found (search time: 7 ms)
1.
Computer-controlled experiments present a number of potential sources of error in timing the presentation of events, including video refresh rate, keyboard scanning rate, and disk I/O times. A terminate-and-stay-resident routine implementing multiple millisecond-accuracy timers is presented. Interfaces permitting use of the timer with several higher-level languages (C, FORTRAN, Pascal, and QuickBASIC) are described, as is a design for a two- to four-button response system using the computer’s printer port. A general strategy is described for using multiple timers to control and measure variation in critical experimental events. A C language program is provided to benchmark variation in the time required to perform common experimental tasks (screen refreshing, switching video pages, disk I/O, loop calculations), and results are summarized for several representative computers that use the IBM design.
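The paper's benchmark is a C program for MS-DOS hardware, but the underlying strategy — timing a critical operation many times and reporting its variability rather than a single value — is easy to reproduce. A minimal modern sketch in Python (the benchmarked task and function names are illustrative, not from the paper):

```python
import time
import statistics

def benchmark(task, n_trials=1000):
    """Time `task` n_trials times and summarize the variation, mirroring
    the paper's strategy of measuring timing variability rather than
    assuming a fixed cost per operation."""
    samples_ms = []
    for _ in range(n_trials):
        t0 = time.perf_counter_ns()
        task()
        samples_ms.append((time.perf_counter_ns() - t0) / 1e6)
    return {
        "mean_ms": statistics.fmean(samples_ms),
        "sd_ms": statistics.stdev(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
    }

# Example: variability of a loop calculation, one of the task types
# benchmarked in the paper's C program.
print(benchmark(lambda: sum(range(10_000))))
```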

2.
A procedure is proposed for approximating attained significance levels of exact conditional tests. The procedure samples from the null distribution of tables having the same marginal frequencies as the observed table. Application of the approximation through a computer subroutine yields precise approximations for practically any table dimensions and sample size.
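The abstract does not spell out the sampling algorithm; the sketch below shows one standard construction for a two-way table, conditioning on both margins by permuting the column labels of the underlying observations and comparing a chi-square statistic against its resampled null distribution (function names are mine):

```python
import numpy as np

def chi2_stat(t):
    """Pearson chi-square statistic for a two-way table."""
    expected = np.outer(t.sum(axis=1), t.sum(axis=0)) / t.sum()
    return ((t - expected) ** 2 / expected).sum()

def mc_conditional_pvalue(table, n_samples=10_000, seed=None):
    """Monte Carlo approximation of the attained significance level of an
    exact conditional test: sample tables with the observed margins by
    permuting the column labels of the underlying observations."""
    rng = np.random.default_rng(seed)
    table = np.asarray(table, dtype=int)
    r_idx, c_idx = np.indices(table.shape)
    rows = np.repeat(r_idx.ravel(), table.ravel())  # one row label per case
    cols = np.repeat(c_idx.ravel(), table.ravel())  # one column label per case
    observed = chi2_stat(table.astype(float))
    hits = 0
    for _ in range(n_samples):
        sampled = np.zeros_like(table)
        np.add.at(sampled, (rows, rng.permutation(cols)), 1)
        hits += chi2_stat(sampled.astype(float)) >= observed
    return (hits + 1) / (n_samples + 1)             # keeps the estimate > 0

print(mc_conditional_pvalue([[10, 2], [3, 9]]))     # illustrative 2 x 2 table
```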

3.
Statistical significance tests are derived and evaluated for measuring apparent differences between an obtained and an expected binormal ROC curve, between two independent binormal ROC curves, and among groups of independent binormal ROC curves. A binormal ROC curve is described by two parameters which represent the spread of the means and the ratio of the standard deviations of the two underlying Gaussian decision variable distributions. To test the significance of apparent differences between or among ROC curves, approximate χ2 statistics for each of the three tests were constructed from maximum likelihood estimates of the two parameters defining the binormal ROC curve. The performance of each test statistic was evaluated by simulating five-category rating scale data with equal numbers of noise and signal-plus-noise trials (set at 50, 250, and 500) for each of three typical ROC curves. For the significance test involving only one ROC curve, rating scale data were generated from the chance diagonal of the ROC space also. Although test performance was found to be somewhat dependent on the number of trials and on the location of the ROC curve in the ROC space, comparisons of the obtained and expected fractions of (falsely) significant results at various α levels showed the proposed statistical significance tests to be reliable under practical experimental conditions.
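For reference, a binormal ROC curve has the closed form TPR = Φ(a + b·Φ⁻¹(FPR)), where a reflects the spread of the means and b the ratio of the standard deviations. A small sketch (parameter values are illustrative, not from the paper):

```python
import numpy as np
from scipy.stats import norm

def binormal_roc(a, b, n_points=99):
    """Binormal ROC: TPR = Phi(a + b * Phi^{-1}(FPR)). `a` is the spread of
    the means and `b` the ratio of the standard deviations of the two
    underlying Gaussian decision-variable distributions."""
    fpr = np.linspace(0.01, 0.99, n_points)
    return fpr, norm.cdf(a + b * norm.ppf(fpr))

# a = 0, b = 1 yields TPR = FPR, the chance diagonal used in the
# one-curve significance test.
fpr, tpr = binormal_roc(a=1.5, b=1.0)
```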

4.
刘彦楼 (Liu Yanlou). 《心理学报》 (Acta Psychologica Sinica), 2022, 54(6): 703-724
The standard errors (SEs; or the variance–covariance matrix) and confidence intervals (CIs) of cognitive diagnosis model parameters have important theoretical and practical value for quantifying the uncertainty of parameter estimates, testing differential item functioning, item-level model comparison, Q-matrix validation, and exploring attribute hierarchies. This study proposes two new methods for computing SEs and CIs: the parallel parametric bootstrap and the parallel nonparametric bootstrap. Simulation studies show that when the model is correctly specified, both methods perform well in computing the SEs and CIs of model parameters under high- and medium-quality item conditions; when the model contains redundant parameters, the SEs and CIs of most admissible parameters still perform well under high- and medium-quality item conditions. An empirical example illustrates the value of the new methods and the gain in computational efficiency.
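The paper applies these methods to cognitive diagnosis models, but the parallel nonparametric bootstrap itself is generic. A minimal sketch under that reading (`fit` is a placeholder for whatever routine estimates the model parameters from a data matrix; it is not from the paper):

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def _one_replicate(args):
    """Fit the model on one nonparametric (case-resampling) bootstrap sample."""
    data, seed, fit = args
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(data), len(data))
    return fit(data[idx])

def parallel_nonparametric_bootstrap(data, fit, n_boot=1000, workers=4):
    """Bootstrap SEs and percentile CIs, with replicates spread across
    processes (the 'parallel' part of the method). `data` is a NumPy array
    of cases; `fit` must be a picklable top-level function returning a
    parameter vector. Run under `if __name__ == "__main__":` on platforms
    that spawn worker processes."""
    args = [(data, seed, fit) for seed in range(n_boot)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        estimates = np.array(list(pool.map(_one_replicate, args)))
    se = estimates.std(axis=0, ddof=1)
    ci_low, ci_high = np.percentile(estimates, [2.5, 97.5], axis=0)
    return se, (ci_low, ci_high)
```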

5.
Two small-sample tests (proposed by Tate and Clelland and by Chapanis respectively) of hypotheses about the parameters of the multinomial distribution, where
$$f(x \mid p) = n! \prod_{i=1}^{k} \frac{p_i^{x_i}}{x_i!}$$

6.
We propose a simple modification of Hochberg's step-up Bonferroni procedure for multiple tests of significance. The proposed procedure is always more powerful than Hochberg's procedure for more than two tests, and is more powerful than Hommel's procedure for three and four tests. A numerical analysis of the new procedure indicates that its Type I error is controlled under independence of the test statistics, at a level equal to or just below the nominal Type I error. Examination of various non-null configurations of hypotheses shows that the modified procedure has a power advantage over Hochberg's procedure which increases in relationship to the number of false hypotheses.
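The abstract does not specify the modification, so the sketch below implements the classic Hochberg step-up baseline it builds on: sort the p-values, find the largest k with p(k) ≤ α/(m − k + 1), and reject the k hypotheses with the smallest p-values:

```python
import numpy as np

def hochberg(pvals, alpha=0.05):
    """Hochberg's step-up Bonferroni procedure. Returns a boolean array
    marking which hypotheses are rejected."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                      # ascending p-values
    sorted_p = p[order]
    # step-up thresholds: alpha/m for the smallest p, ..., alpha for the largest
    thresholds = alpha / (m - np.arange(1, m + 1) + 1)
    below = np.nonzero(sorted_p <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size:
        k = below.max()                        # largest qualifying rank (0-based)
        reject[order[: k + 1]] = True
    return reject

print(hochberg([0.01, 0.04, 0.03, 0.20]))      # illustrative p-values
```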

7.
The kappa agreement coefficients of Cohen (1960) and of Brennan and Prediger (1981) are defined and compared. A FORTRAN program is described that computes Cohen's kappa and Brennan and Prediger's kappa and their associated probability values, based on Monte Carlo resampling and the binomial distribution, respectively.
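The program described is in FORTRAN, but the two point estimates are simple: both are (po − pe)/(1 − pe), differing only in the chance-agreement term pe. A sketch (the resampling-based probability values the program also reports are omitted):

```python
import numpy as np

def agreement_kappas(table):
    """Cohen's (1960) and Brennan & Prediger's (1981) kappa from a square
    k x k agreement table (raters in rows and columns, same category order)."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    po = np.trace(t) / n                               # observed agreement
    pe_cohen = (t.sum(axis=0) * t.sum(axis=1)).sum() / n ** 2  # from margins
    pe_bp = 1.0 / t.shape[0]                           # uniform chance agreement
    return (po - pe_cohen) / (1 - pe_cohen), (po - pe_bp) / (1 - pe_bp)

print(agreement_kappas([[20, 5], [10, 15]]))           # illustrative counts
```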

8.
Forced-choice (FC) tests are widely used in noncognitive assessment because they control the response biases that accompany traditional Likert-type items. The traditional scoring of FC tests, however, produces ipsative data, which have long been criticized as unsuitable for comparisons between individuals. In recent years, the development of various forced-choice IRT models has allowed researchers to obtain close-to-normative data from FC tests, renewing the interest of researchers and practitioners in these models. This review first classifies and introduces six mainstream forced-choice IRT models according to the decision model and item response model they adopt. It then compares and summarizes the models in terms of modeling approach and parameter estimation methods. Next, it reviews applied research in three areas: parameter invariance testing, computerized adaptive testing (CAT), and validity. Finally, it proposes four directions for future research: model extension, parameter invariance testing, forced-choice CAT, and validity research.

9.
Emotional intelligence (EI) has attracted considerable interest amongst both individual differences researchers and those in other areas of psychology who are interested in how EI relates to criteria such as well-being and career success. Both trait (self-report) and ability EI measures have been developed; the focus of this paper is on ability EI. The associations of two new ability EI tests with psychometric intelligence, emotion perception, and the Mayer–Salovey–Caruso EI test (MSCEIT) were examined. The new EI tests were the Situational Test of Emotion Management (STEM) and the Situational Test of Emotional Understanding (STEU). Only the STEU and the MSCEIT Understanding Emotions branch were significantly correlated with psychometric intelligence, suggesting that only understanding emotions can be regarded as a candidate new intelligence component. These understanding emotions tests were also positively correlated with emotion perception tests, and STEM and STEU scores were positively correlated with MSCEIT total score and most branch scores. Neither the STEM nor the STEU were significantly correlated with trait EI tests, confirming the distinctness of trait and ability EI. Taking the present results as a starting-point, approaches to the development of new ability EI tests and models of EI are suggested.

10.
11.
A method of IRT observed-score equating using chain equating through a third test, without equating coefficients, is presented under the assumption of the three-parameter logistic model. The asymptotic standard errors of the scores equated by this method are obtained using the results given by M. Liou and P. E. Cheng. The asymptotic standard errors of the IRT observed-score equating method using a synthetic examinee group with equating coefficients, which is a currently used method, are also provided. Numerical examples show that the standard errors from these observed-score equating methods are similar to those from the corresponding true-score equating methods, except in the range of low scores. The author is indebted to Michael J. Kolen for access to the real data used in this article and to anonymous reviewers for their corrections and suggestions on this work.
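IRT observed-score equating rests on the conditional distribution of observed scores given ability under the three-parameter logistic model, conventionally computed with the Lord–Wingersky recursion. A sketch of those two ingredients (item parameters are invented for illustration; nothing here reproduces the paper's chain-equating derivations):

```python
import numpy as np

def p_3pl(theta, a, b, c, D=1.7):
    """Item response probability under the three-parameter logistic model."""
    return c + (1 - c) / (1 + np.exp(-D * a * (theta - b)))

def lord_wingersky(item_probs):
    """Conditional observed-score distribution given theta: the classic
    recursion underlying IRT observed-score equating."""
    dist = np.array([1.0])                 # P(score = 0) with no items yet
    for p in item_probs:
        new = np.zeros(dist.size + 1)
        new[:-1] += dist * (1 - p)         # current item answered wrong
        new[1:] += dist * p                # current item answered right
        dist = new
    return dist

# Example at theta = 0 for three illustrative items (a, b, c):
items = [(1.2, -0.5, 0.2), (0.8, 0.0, 0.25), (1.5, 0.7, 0.2)]
probs = [p_3pl(0.0, a, b, c) for a, b, c in items]
print(lord_wingersky(probs))               # P(X = 0..3 | theta = 0)
```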

12.
Two forces motivate this special section, "New Methods for New Questions in Developmental Psychology." First are recent developments in social science methodology and the increasing availability of those methods in common software packages. Second, at the same time psychologists' understanding of developmental phenomena has continued to grow. At their best, these developments in theory and methods work in tandem, fueling each other. Newer methods make it possible for scientists to better test their ideas; better ideas lead methodologists to techniques that better reflect, capture, and quantify the underlying processes. The articles in this special section represent a sampling of these new methods and new questions. The authors describe common themes in these articles and identify barriers to future progress, such as the lack of data sharing by and analytical training for developmentalists.

13.
In the vast majority of psychological research utilizing multiple regression analysis, asymptotic probability values are reported. This paper demonstrates that asymptotic estimates of standard errors provided by multiple regression are not always accurate. A resampling permutation procedure is used to estimate the standard errors. In some cases the results differ substantially from the traditional least squares regression estimates.
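The paper's procedure is permutation-based; the closely related case-resampling bootstrap below conveys the same idea — replacing asymptotic standard errors with resampling-based ones (a sketch, not the authors' procedure):

```python
import numpy as np

def resampled_coef_se(X, y, n_boot=5000, seed=None):
    """Resampling-based standard errors for OLS coefficients via case
    resampling. Compare these with the asymptotic standard errors from a
    standard regression routine; they can differ when assumptions fail."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])      # prepend an intercept
    boots = np.empty((n_boot, Xd.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)            # resample cases with replacement
        boots[b], *_ = np.linalg.lstsq(Xd[idx], y[idx], rcond=None)
    return boots.std(axis=0, ddof=1)
```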

14.
Clinical significance methods: a comparison of statistical techniques
Clinically significant change refers to meaningful change in individual patient functioning during psychotherapy. Following the operational definition of clinically significant change offered by Jacobson, Follette, and Revenstorf (1984), several alternatives have been proposed because they were thought to be either more accurate or more sensitive to detecting meaningful change. In this study, we compared five methods using a sample of 386 outpatients who underwent treatment in routine clinical practice. Differences were found between methods, suggesting that the statistical method used to calculate clinical significance has an effect on estimates of meaningful change. The Jacobson method (Jacobson & Truax, 1991) provided a moderate estimate of treatment effects and was recommended for use in outcome studies and research on clinically significant change, but future research is needed to validate this statistical method.
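The Jacobson & Truax (1991) method classifies a patient as reliably changed when the Reliable Change Index (RCI) exceeds 1.96 in absolute value, and as recovered when the post-treatment score also crosses a normative cutoff. A sketch of those two computations with illustrative inputs:

```python
import numpy as np

def jacobson_truax_rci(pre, post, sd_pre, reliability):
    """Reliable Change Index: RCI = (post - pre) / Sdiff, where
    Sdiff = sqrt(2 * SE^2) and SE = sd_pre * sqrt(1 - reliability)."""
    se = sd_pre * np.sqrt(1 - reliability)
    s_diff = np.sqrt(2 * se ** 2)
    return (post - pre) / s_diff

def cutoff_c(mean_dys, sd_dys, mean_func, sd_func):
    """Cutoff c: the point between the dysfunctional and functional
    population means, weighted by their standard deviations."""
    return (sd_dys * mean_func + sd_func * mean_dys) / (sd_dys + sd_func)

# Example with invented values: a symptom score dropping from 30 to 12.
rci = jacobson_truax_rci(pre=30, post=12, sd_pre=7.5, reliability=0.88)
reliably_changed = abs(rci) > 1.96
```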

15.
A coefficient derived from communalities of test parts has been proposed as the greatest lower bound to Guttman's immediate retest reliability. The communalities have at times been calculated from covariances between item sets, which tends to underestimate the coefficient appreciably. When items are experimentally independent, a consistent estimate of the greatest defensible internal-consistency coefficient is obtained by factoring item covariances. In samples of modest size, this analysis capitalizes on chance; an estimate subject to less upward bias is suggested. For estimating alternate-forms reliability, communality-based coefficients are less appropriate than stratified alpha. I thank Edward Haertel for comments and suggestions, and Andrew Comrey for data.
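For context, the stratified alpha recommended above is built from ordinary coefficient alpha, which is computed from the item variances and covariances. A sketch assuming a cases × items score matrix (not code from the paper):

```python
import numpy as np

def coefficient_alpha(scores):
    """Cronbach's coefficient alpha from a cases x items score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of the total)."""
    cov = np.cov(scores, rowvar=False)     # item covariance matrix
    k = cov.shape[0]
    return (k / (k - 1)) * (1 - np.trace(cov) / cov.sum())
```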

16.
This paper seeks to meet the need for a general treatment of the problem of error in classification. Within an m-attribute classificatory system, an object's typical subclass is that subclass to which it is most often allocated under repeated experimentally independent applications of the classificatory criteria. In these terms, an error of classification is an atypical subclass allocation. This leads to definition of probabilities O of occasional subclass membership, probabilities T of typical subclass membership, and probabilities E of error or, more generally, of occasional subclass membership conditional upon typical subclass membership. In the relationship f: (O, T, E) the relative incidence of independent O, T, and E values is such that generally one can specify O values given T and E, but one cannot generally specify T and E values given O. Under the restrictions of homogeneity of E values for all members of a given typical subclass, mutual stochastic independence of errors of classification, and suitable conditions of replication, one can find particular systems O = f(T, E) which are solvable for T and E given O. A minimum of three replications of occasional classification is necessary for a solution of systems for marginal attributes, and a minimum of two replications is needed with any cross-classification. Although for such systems one can always specify T and E values given O values, the solution is unique for dichotomous systems only. With grateful acknowledgement to the Rockefeller Foundation; and to the United States Department of Health, Education, and Welfare, Public Health Service, for N.I.M.H. Grant M-3950.
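Reading E as a matrix of probabilities of occasional membership conditional on typical membership, the relationship the paper solves can be restated (my restatement, inferred from the definitions above) as the law of total probability:

$$O_j = \sum_i T_i \, E_{ij}, \qquad E_{ij} = \Pr(\text{occasional subclass } j \mid \text{typical subclass } i)$$

With replications, the observed O values over-determine this system, which is what makes T and E recoverable under the stated restrictions.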

17.
The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non-normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann–Yuen test of equivalence, which utilizes trimmed means, with that of the previously recommended Schuirmann and Schuirmann–Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann–Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann–Welch tests, and the power of the Schuirmann–Yuen was substantially greater than that of the Schuirmann or Schuirmann–Welch tests when distributions were skewed or outliers were present. The Schuirmann–Yuen test is recommended for assessing clinical significance with normative comparisons.
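A sketch of the Schuirmann–Yuen procedure as described: Schuirmann's two one-sided tests (TOST) with the t statistics replaced by Yuen's trimmed-means version (20% trimming is a common default; the paper's exact settings may differ):

```python
import numpy as np
from scipy import stats

def schuirmann_yuen(x, y, delta, trim=0.2):
    """Equivalence TOST with Yuen's trimmed-means t statistic. `delta` is
    the equivalence margin; equivalence is declared when the returned
    p-value (max of the two one-sided tests) falls below alpha."""
    def parts(sample):
        a = np.sort(np.asarray(sample, dtype=float))
        n = a.size
        g = int(np.floor(trim * n))
        h = n - 2 * g                              # effective sample size
        w = a.copy()
        w[:g] = a[g]                               # winsorize lower tail
        w[n - g:] = a[n - g - 1]                   # winsorize upper tail
        d = (n - 1) * w.var(ddof=1) / (h * (h - 1))
        return stats.trim_mean(a, trim), d, h

    m1, d1, h1 = parts(x)
    m2, d2, h2 = parts(y)
    diff, se = m1 - m2, np.sqrt(d1 + d2)
    df = (d1 + d2) ** 2 / (d1 ** 2 / (h1 - 1) + d2 ** 2 / (h2 - 1))
    p_lower = stats.t.sf((diff + delta) / se, df)  # H0: diff <= -delta
    p_upper = stats.t.cdf((diff - delta) / se, df) # H0: diff >= +delta
    return diff, max(p_lower, p_upper)
```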

18.
Current models of word production assume that words are stored as linear sequences of phonemes which are structured into syllables only at the moment of production. This is because syllable structure is always recoverable from the sequence of phonemes. In contrast, we present theoretical and empirical evidence that syllable structure is lexically represented. Storing syllable structure would have the advantage of making representations more stable and resistant to damage. On the other hand, re-syllabifications affect only a minimal part of phonological representations and occur only in some languages and depending on speech register. Evidence for these claims comes from analyses of aphasic errors which not only respect phonotactic constraints, but also avoid transformations which move the syllabic structure of the word further away from the original structure, even when equating for segmental complexity. This is true across tasks, types of errors, and, crucially, types of patients. The same syllabic effects are shown by apraxic patients and by phonological patients who have more central difficulties in retrieving phonological representations. If syllable structure was only computed after phoneme retrieval, it would have no way to influence the errors of phonological patients. Our results have implications for psycholinguistic and computational models of language as well as for clinical and educational practices.

19.
Two experiments investigated the role of metacognition in changing answers to multiple-choice, general-knowledge questions. Both experiments revealed qualitatively different errors produced by speeded responding versus confusability amongst the alternatives; revision completely corrected the former, but had no effect on the latter. Experiment 2 also demonstrated that a pretest, designed to make participants' actual experience with answer changing either positive or negative, affected the tendency to correct errors. However, this effect was not apparent in the proportion of correct responses; it was only discovered when the metacognitive component to answer changing was isolated with a Type 2 signal-detection measure of discrimination. Overall, the results suggest that future research on answer changing should more closely consider the metacognitive factors underlying answer changing, using Type 2 signal-detection theory to isolate these aspects of performance.
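A Type 2 analysis treats the correctness of one's initial answer as the signal and the decision to change it as the response. One simple Gaussian Type 2 discrimination measure is d′ computed from Type 2 hit and false-alarm rates, sketched below (an illustration of the general approach; the paper's exact measure may differ):

```python
from statistics import NormalDist

def type2_dprime(n_change_when_wrong, n_wrong, n_change_when_right, n_right):
    """Type 2 d': discrimination between one's own wrong and right initial
    answers, using 'changed the answer' as the Type 2 response. A log-linear
    correction avoids infinite z-scores at rates of 0 or 1."""
    hit = (n_change_when_wrong + 0.5) / (n_wrong + 1)
    fa = (n_change_when_right + 0.5) / (n_right + 1)
    z = NormalDist().inv_cdf
    return z(hit) - z(fa)

# Example: changed 15 of 20 wrong answers but only 6 of 60 right ones.
print(type2_dprime(15, 20, 6, 60))
```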

20.
A distinction is drawn between the method of principal components developed by Hotelling and the common factor analysis discussed in psychological literature, both from the point of view of the stochastic models involved and of the problems of statistical inference. The appropriate statistical techniques are briefly reviewed in the first case and detailed in the second. A new method of analysis called canonical factor analysis, explaining the correlations between rather than the variances of the measurements, is developed. This analysis furnishes one out of a number of possible solutions to the maximum likelihood equations of Lawley. It admits an iterative procedure for estimating the factor loadings and also for constructing the likelihood criterion useful in testing a specified hypothesis on the number of factors and in determining a lower confidence limit to the number of factors.
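Lawley's maximum likelihood equations admit a transparent fixed-point iteration: eigendecompose the uniqueness-rescaled correlation matrix, form loadings from the eigenvalues in excess of one, and update the uniquenesses. A minimal sketch of that iteration (no convergence checks; this is a generic scheme, not the paper's exact algorithm):

```python
import numpy as np

def canonical_factor_analysis(R, k, n_iter=200):
    """Iterative factor solution for a correlation matrix R with k factors,
    based on the stationary conditions of the ML equations:
    Lambda = Psi^{1/2} V_k (Gamma_k - I)^{1/2}."""
    p = R.shape[0]
    psi = np.full(p, 0.5)                      # initial uniquenesses
    for _ in range(n_iter):
        s = 1.0 / np.sqrt(psi)
        M = R * np.outer(s, s)                 # Psi^{-1/2} R Psi^{-1/2}
        vals, vecs = np.linalg.eigh(M)
        vals, vecs = vals[-k:][::-1], vecs[:, -k:][:, ::-1]
        load = np.sqrt(psi)[:, None] * vecs * np.sqrt(np.maximum(vals - 1.0, 0.0))
        psi = np.clip(np.diag(R - load @ load.T), 1e-6, None)
    return load, psi
```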
