Previous designs for online calibration have only considered examinees’ responses to items. However, the use of response time, a useful metric that can easily be collected by a computer, has not yet been embedded in calibration designs. In this article we utilize response time to optimize the assignment of new items online, and accordingly propose two new adaptive designs. These are the D-optimal per expectation time unit design (D-ET) and the D-optimal per time unit design (D-T). The former method uses the conditional maximum likelihood estimation (CMLE) method to estimate the expected response times, while the latter employs the nonparametric k-nearest-neighbour method to predict the response times. Simulations were conducted to compare the two new designs with the D-optimal online calibration design (D design) in the context of continuous online calibration. In addition, a preliminary study was carried out to evaluate the performance of CMLE prior to its application in D-ET. The results showed that, compared to the D design, the D-ET and D-T designs saved response time and accrued larger calibration information per time unit, without sacrificing item calibration precision.

A theory is proposed in which beliefs in the form of internal cue validities mediate the processing of ecological cue validities in the assessment of confidence. The conditions necessary for perfect calibration are specified: (a) correspondence between ecological and internal validity, (b) perfect translation of internal validity into a confidence assessment, and (c) consistent utilization of cues. Process errors are then added to these conditions to investigate how calibration is affected by error variance of confidence assessments. To accomplish this, the calibration score (C) is decomposed into three additive parts: D² = bias, i.e., the squared difference between mean confidence and proportion correct; R² = resolution, i.e., the squared difference between the standard deviations of confidence and proportion correct; L = linearity, i.e., how closely the calibration curve follows a linear function. In the equation C = D² + R² + L, R² (resolution) reflects the subject's ability to discriminate cue validities. Selection of items is a critical factor in studies of confidence. Informal selection with a tendency to avoid easy items results in overconfidence. Internal cue theory predicts both that overconfidence should disappear (in accordance with previous research) and that resolution should improve when item selection is made representative of the natural environment. Both predictions are confirmed by data from published studies on confidence in general knowledge. It is noteworthy that resolution is still poor and accounts for the major portion of miscalibration under representative item selection.
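
As a rough illustration of the quantities named in this abstract, the following Python sketch computes a calibration score and the bias term from binned confidence data. It assumes that C is the usual weighted mean squared difference between stated confidence and proportion correct within each confidence category, which the abstract does not spell out. The function names and the sample bins are hypothetical, and the resolution and linearity terms are left out because their exact formulas are not given here.

```python
def calibration_score(bins):
    """Weighted mean squared gap between confidence and accuracy.

    bins: list of (confidence, n_items, proportion_correct) tuples,
    one tuple per confidence category (hypothetical input layout).
    """
    total = sum(n for _, n, _ in bins)
    return sum(n * (conf - prop) ** 2 for conf, n, prop in bins) / total


def bias_squared(bins):
    """Squared difference between mean confidence and overall proportion correct."""
    total = sum(n for _, n, _ in bins)
    mean_conf = sum(n * conf for conf, n, _ in bins) / total
    prop_correct = sum(n * prop for _, n, prop in bins) / total
    return (mean_conf - prop_correct) ** 2


# Made-up data: 10 items answered at 60% confidence (half correct)
# and 10 items answered at 90% confidence (70% correct).
example_bins = [(0.6, 10, 0.5), (0.9, 10, 0.7)]
```

With these made-up bins, mean confidence (0.75) exceeds the proportion correct (0.60), so the bias term reflects overconfidence.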

A reliability study is assumed to be carried out with each of a number of observers making a dichotomous judgment concerning each of a sample of subjects. A nonparametric model is proposed for the errors underlying such judgments, and conditions are given under which Cochran's Q statistic is valid for testing the hypothesis of no systematic differences among the judgments of the different observers. Inferences concerning the probabilities of error are shown to be possible in terms of the intraclass correlation coefficient. A numerical example is given. This research was supported in part by NIMH grant MH-03546. I am indebted to Dr. E. I. Burdock, Associate Research Scientist, Biometrics Research, for his valuable suggestions and criticisms.
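
For readers unfamiliar with the statistic this abstract refers to, here is a minimal Python sketch of the standard textbook form of Cochran's Q for a subjects-by-observers table of 0/1 judgments. It is not taken from the paper itself; the function name and the small ratings table are hypothetical. Under the null hypothesis of no systematic observer differences, Q is conventionally compared with a chi-square distribution with k − 1 degrees of freedom, where k is the number of observers.

```python
def cochran_q(ratings):
    """Cochran's Q for dichotomous judgments.

    ratings: list of rows, one row per subject, each row holding one
    0/1 judgment per observer (all rows the same length).
    """
    k = len(ratings[0])  # number of observers
    col_totals = [sum(row[j] for row in ratings) for j in range(k)]
    row_totals = [sum(row) for row in ratings]
    grand_total = sum(row_totals)

    denominator = k * grand_total - sum(r * r for r in row_totals)
    if denominator == 0:
        # Happens only when every subject is rated 0 by all or 1 by all.
        raise ValueError("Q is undefined when every subject is rated unanimously")

    numerator = (k - 1) * (k * sum(c * c for c in col_totals) - grand_total ** 2)
    return numerator / denominator


# Made-up example: 4 subjects judged by 3 observers.
example_ratings = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [1, 0, 0]]
```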

A first order uncountably valued logic L_Q(0,1) for management of uncertainty is considered. It is obtained from approximation logics L_T of any poset type (T, ≤) (see Rasiowa [17], [18], [19]) by assuming (T, ≤) = (Q(0,1), ≤), where Q(0,1) is the set of all rational numbers q such that 0 < q < 1 and ≤ is the arithmetic ordering, by eliminating modal connectives and adopting a semantics based on LT-fuzzy sets (see Rasiowa and Cat Ho [20], [21]). Logic L_Q(0,1) can be treated as an important case of LT-fuzzy logics (introduced in Rasiowa and Cat Ho [21]) for (T, ≤) = (Q(0,1), ≤), i.e. as LQ(0,1)-fuzzy logic announced in [21] but first examined in this paper. L_Q(0,1) deals with vague concepts represented by predicate formulas and applies approximate truth-values being certain subsets of Q(0,1). The set of all approximate truth-values consists of the empty set ∅ and all non-empty subsets s of Q(0,1) such that if q ∈ s and q′ ≤ q, then q′ ∈ s. The set LQ(0,1) of all approximate truth-values is uncountable and covers up to monomorphism the closed interval [0, 1] of the real line. LQ(0,1) is a complete set lattice and therefore a pseudo-Boolean (Heyting) algebra. Equipped with some additional operations it is a basic plain semi-Post algebra of type Q(0,1) (see Rasiowa and Cat Ho [20]) and is taken as a truth-table for L_Q(0,1) logic. L_Q(0,1) can be considered as a modification of Zadeh's fuzzy logic (see Bellman and Zadeh [2] and Zadeh and Kacprzyk, eds. [29]). The aim of this paper is an axiomatization of logic L_Q(0,1) and proofs of the completeness theorem and of the theorem on the existence of LQ(0,1)-models (i.e. models under the semantics introduced) for consistent theories based on any denumerable set of specific axioms. Proofs apply the theory of plain semi-Post algebras investigated in Cat Ho and Rasiowa [4]. Presented by Cecylia Rauszer


Attention is known to be sensitive to the temporal structure of scenes. We initially tested whether feature synchrony, an attribute with potential special status because of its association with objecthood, is something which draws attention. Search items were surrounded by colours which periodically changed either in synchrony or out of synchrony with periodic changes in their shape. Search for a target was notably faster when the target location contained a unique synchronous feature change amongst asynchronous changes. However, the reverse situation produced no search advantage. A second experiment showed that this effect of unique synchrony was actually a consequence of the lower rate of perceived flicker in the synchronous compared to the asynchronous items, not the synchrony itself. In our displays it seems that attention is drawn towards a location which has a relatively low rate of change. Overall, the pattern of results suggested the attentional bias we find is for relative temporal stability. Results stand in contrast to other work which has found high and low flicker rates to both draw attention equally [Cass, J., Van der Burg, E., & Alais, D. (2011). Finding flicker: Critical differences in temporal frequency capture attention. Frontiers in Psychology, 2, 320]. Further work needs to determine the exact conditions under which this bias is and is not found when searching in complex dynamically-changing displays.

A logic is a pair (P, Q) where P is a set of formulas of a fixed propositional language and Q is a set of rules. A formula α is deducible from X in the logic (P, Q) if it is deducible from X ∪ P via Q. A matrix M is strongly adequate to (P, Q) if for any α, X, α is deducible from X iff for every valuation in M, α is designated whenever all the formulas in X are. It is proved in the present paper that if Q = {modus ponens, adjunction} and P ∈ {E, R, E +, R +, E I, R I} then there exists a matrix strongly adequate to (P, Q).

In three-mode Principal Components Analysis, the P × Q × R core matrix G can be transformed to simple structure before it is interpreted. It is well-known that, when P = QR, G can be transformed to the identity matrix, which implies that all elements become equal to values specified a priori. In the present paper it is shown that, when P = QR − 1, G can be transformed to have nearly all elements equal to values specified a priori. A closed-form solution for this transformation is offered. Theoretical and practical implications of this simple structure transformation of G are discussed. Constructive comments from anonymous reviewers are gratefully acknowledged.

It is often considered desirable to have the same ordering of the items by difficulty across different levels of the trait or ability. Such an ordering is an invariant item ordering (IIO). An IIO facilitates the interpretation of test results. For dichotomously scored items, earlier research surveyed the theory and methods of an invariant ordering in a nonparametric IRT context. Here the focus is on polytomously scored items, and both nonparametric and parametric IRT models are considered. The absence of the IIO property in two nonparametric polytomous IRT models is discussed, and two nonparametric models are discussed that imply an IIO. A method is proposed that can be used to investigate whether empirical data imply an IIO. Furthermore, only two parametric polytomous IRT models are found to imply an IIO. These are the rating scale model (Andrich, 1978) and a restricted rating scale version of the graded response model (Muraki, 1990). Well-known models, such as the partial credit model (Masters, 1982) and the graded response model (Samejima, 1969), do not imply an IIO.

Choice of the appropriate model in meta‐analysis is often treated as an empirical question which is answered by examining the amount of variability in the effect sizes. When all of the observed variability in the effect sizes can be accounted for based on sampling error alone, a set of effect sizes is said to be homogeneous and a fixed‐effects model is typically adopted. Whether a set of effect sizes is homogeneous or not is usually tested with the so‐called Q test. In this paper, a variety of alternative homogeneity tests – the likelihood ratio, Wald and score tests – are compared with the Q test in terms of their Type I error rate and power for four different effect size measures. Monte Carlo simulations show that the Q test kept the tightest control of the Type I error rate, although the results emphasize the importance of large sample sizes within the set of studies. The results also suggest under what conditions the power of the tests can be considered adequate.
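
The Q test mentioned in this abstract can be sketched in a few lines of Python. This is the conventional inverse-variance form of the statistic, not code from the paper: each effect size is weighted by the reciprocal of its sampling variance, and Q is the weighted sum of squared deviations from the pooled estimate. The function name and the example numbers are hypothetical. Under homogeneity, Q is usually referred to a chi-square distribution with one fewer degree of freedom than the number of studies.

```python
def homogeneity_q(effects, variances):
    """Return (Q, degrees_of_freedom) for a set of study effect sizes.

    effects: observed effect sizes, one per study.
    variances: their sampling variances (all positive).
    """
    weights = [1.0 / v for v in variances]  # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return q, len(effects) - 1


# Made-up example: two studies with equal sampling variance.
q_value, df = homogeneity_q([0.2, 0.4], [0.1, 0.1])
```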

Although pictures are often added to text in items of educational tests, little is known about their influence on item solving. Therefore, we conducted an experiment in which we examined how pictures affected item solving. A total of N = 158 fourth‐grade students completed a physics knowledge test under one of six experimental conditions. The experimental conditions varied according to whether or not pictures were presented in the stem and in the answer options of the test items. The results showed that pictures in the stem and in the answer options increased the correctness with which students responded to the test items. This was particularly true for test items that required the application of relationships. In addition, response time was reduced when pictures were added to the answer options of the test items. Hence, pictures are an important feature of test items that produce changes in item processing. Copyright © 2011 John Wiley & Sons, Ltd.

The (univariate) isotonic psychometric (ISOP) model (Scheiblechner, 1995) is a nonparametric IRT model for dichotomous and polytomous (rating scale) psychological test data. A weak subject independence axiom W1 postulates that the subjects are ordered in the same way except for ties (i.e., similarly or isotonically) by all items of a psychological test. A weak item independence axiom W2 postulates that the order of the items is similar for all subjects. Local independence (LI or W3) is assumed in all models. With these axioms, sample-free unidimensional ordinal measurements of items and subjects become feasible. A cancellation axiom (Co) gives, as a result, the additive isotonic psychometric (ADISOP) model and interval scales for subjects and items, and an independence axiom (W4) gives the completely additive isotonic psychometric (CADISOP) model with an interval scale for the response variable (Scheiblechner, 1999). The d-ISOP, d-ADISOP, and d-CADISOP models are generalizations to d-dimensional dependent variables (e.g., speed and accuracy of response). The author would like to thank an Associate Editor and two anonymous referees and also Professor H.H. Schulze for their very valuable suggestions and corrections.

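谭青蓉, 汪大勋, 罗芬, 蔡艳, 涂冬波. Acta Psychologica Sinica (《心理学报》), 2021, 53(11): 1286-1300
Item replenishing is essential to maintaining the item bank of cognitive diagnostic computerized adaptive testing (CD-CAT), and online calibration is an important way of replenishing items. Drawing on the idea of feature selection from data mining, this study proposes an efficient online calibration method based on entropy-based information gain (denoted IGEOCM), which uses examinees' responses to both new and operational items to jointly estimate the Q-matrix and the item parameters of the new items. A Monte Carlo simulation was used to verify the performance of the new method and to compare it with the existing online calibration methods SIE, SIE-R-BIC, and RMSEA-N. The results showed that IGEOCM achieved good item calibration accuracy and item estimation efficiency under all experimental conditions and overall outperformed SIE and the other existing methods; in addition, IGEOCM needed less time to calibrate new items than SIE and the other methods. In sum, the study provides a more efficient and accurate method for replenishing items in CD-CAT item banks.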
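
The entropy-based information gain that this abstract borrows from feature selection can be illustrated with a generic Python sketch. This is the textbook definition (entropy of the outcome minus its expected entropy after splitting on a feature), not the IGEOCM procedure itself, whose details are not given here. The function names and the idea of treating mastery of an attribute as the feature and a 0/1 item response as the outcome are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on the feature."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    # Expected entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical data: 0/1 responses of four examinees to a new item,
# and two candidate attributes (1 = examinee has mastered the attribute).
responses = [0, 0, 1, 1]
attribute_a = [0, 0, 1, 1]  # perfectly aligned with the responses
attribute_b = [0, 1, 0, 1]  # unrelated to the responses
```

In this toy case the first attribute yields the maximal gain of one bit and the second yields none, so a gain-based rule would flag the first attribute as the one the item measures.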

A definition of essential independence is proposed for sequences of polytomous items. For items satisfying the reasonable assumption that the expected amount of credit awarded increases with examinee ability, we develop a theory of essential unidimensionality which closely parallels that of Stout. Essentially unidimensional item sequences can be shown to have a unique (up to change-of-scale) dominant underlying trait, which can be consistently estimated by a monotone transformation of the sum of the item scores. In more general polytomous-response latent trait models (with or without ordered responses), an M-estimator based upon maximum likelihood may be shown to be consistent for θ under essentially unidimensional violations of local independence and a variety of monotonicity/identifiability conditions. A rigorous proof of this fact is given, and the standard error of the estimator is explored. These results suggest that ability estimation methods that rely on the summation form of the log likelihood under local independence should generally be robust under essential independence, but standard errors may vary greatly from what is usually expected, depending on the degree of departure from local independence. An index of departure from local independence is also proposed. This work was supported in part by Office of Naval Research Grant N00014-87-K-0277 and National Science Foundation Grant NSF-DMS-88-02556. The author is grateful to William F. Stout for many helpful comments, and to an anonymous reviewer for raising the questions addressed in section 2. A preliminary version of section 6 appeared in the author's Ph.D. thesis.

Topic of the paper is Q-logic – a logic of agency in its temporal and modal context. Q-logic may be considered as a basal logic of agency since the most important stit-operators discussed in the literature can be defined or axiomatized easily within its semantical and syntactical framework. Its basic agent dependent operator, the Q-operator (also known as - or cstit-operator), which has been discussed independently by F. v. Kutschera and B. F. Chellas, is investigated here in respect of its relation to other temporal and modal operators. The main result of the paper, then, is a completeness result for a calculus of Q-logic with respect to a semantics defined on the tree-approach to agency as introduced and developed by, among others, F. v. Kutschera and N. D. Belnap.

Objectives: Mini Mental State Examination’s (MMSE’s) sensitivity in its upper level is questioned, hence we investigated cognitive abnormalities and defects in regional cerebral blood flow (rCBF) in elderly with MMSE scores ≥24.

Methods: One hundred and four men at age 81 with MMSE scores ≥24 (mean 28.4 ± 1.7) and no dementia or stroke were examined with a neuropsychological test battery, and their rCBF was estimated using 99mTc-HMPAO SPECT.

Results: MMSE was very sparsely correlated with rCBF. Instead, visuo-spatial tests were correlated with rCBF in the parietal and occipital lobes, verbal tests with rCBF in the frontal and temporal-parietal lobes, and, most of all, Digit Symbol with all rCBF regions, especially in subcortical gray and white matter. In a cluster of low achievers, the test of Synonyms, followed by Digit Symbol and the Benton test, had the highest discriminatory importance. Low achievers had generalized rCBF changes, especially in subcortical areas. Only lower scores on two MMSE items, figure drawing and calculation, could discriminate the clusters.

Conclusion: A substantial number of octogenarian men with MMSE ≥ 24p have widespread rCBF changes corresponding to a decreased speeded performance and verbal capacity.

An attempt is made to include the axioms of Mackey for probabilities of experiments in quantum mechanics into the calculus Łℵ₀ of Łukasiewicz. The obtained calculus Q contains an additional modal sign Q and four modal rules of inference. The proposition Qx is read "x is confirmed". The most specific rule of inference may be read: for comparable observations implication is equivalent to confirmation of material implication. The semantic truth of Q is established by the interpretation with the help of physical objects obeying the rules of quantum mechanics. The embedding of the usual quantum propositional logic in Q is accomplished. Allatum est die 9 Junii 1976

A system FDQ of first degree entailment with quantification, extending classical quantification logic Q by an entailment connective, is axiomatised, and the choice of axioms defended and also, from another viewpoint, criticised. The system proves to be the equivalent to the first degree part of the quantified entailmental system EQ studied by Anderson and Belnap; accordingly the semantics furnished are alternative to those provided for the first degree of EQ by Belnap. A worlds semantics for FDQ is presented, and the soundness and completeness of FDQ proved, the main work of the paper going into the proof of completeness. The adequacy result is applied to yield, as well as the usual corollaries, weak relevance of FDQ and the fact that FDQ is the common first degree of a wide variety of (constant domain) quantified relevant logics. Finally much unfinished business at the first degree is discussed.

Objectives: Serial performance evaluations show calibration effects: Judges avoid extreme categories in the beginning (e.g. best or worst) because they need to calibrate an internal judgment scale (Unkelbach et al., 2012). Successful calibration is therefore important for fair and unbiased evaluations. A central prerequisite for successful calibration is knowledge about the performance range. The present study tests whether advance knowledge about the range (best and worst) of performances in a series reduces calibration effects.

Design: A 2 × 2 × 2 design was developed with two between subject factors: the knowledge about the performance range (with vs. without) and two different talent tests (specific vs. unspecific). As within subject factor the position of the performances in the series (position 1–10 vs. 11–20) was integrated. The combination of the between subject factors resulted in four experimental conditions.

Method: Handball coaches were randomly assigned to one of the conditions. Afterwards twenty performances were evaluated in a randomized order by the coaches.

Results: Without knowledge about the range, they showed the expected avoidance of extreme categories in the beginning independent of the presented talent test. However, observing the best and worst performance in advance prevented the biases. Range-presentation is therefore a viable theory-based intervention to improve fairness in serial judgments.

Growth curve modeling is one of the main analytical approaches to study change over time. Growth curve models are commonly estimated in the linear and nonlinear mixed-effects modeling framework in which both the mean and person-specific curves are modeled parametrically with functions of time such as the linear, quadratic, and exponential. However, when more complex nonlinear trajectories need to be estimated and researchers do not have a priori knowledge of an appropriate functional form of growth, parametric models may be too restrictive. This paper reviews functional mixed-effects models, a nonparametric extension of mixed-effects models that permit both the mean and person-specific curves to be estimated without assuming a prespecified functional form of growth. Details of the model are presented along with results from a simulation study and an empirical example. The simulation study showed functional mixed-effects models performed reasonably well under various conditions commonly associated with longitudinal panel data, such as few time points per person, irregularly spaced time points across persons, missingness, and nonlinear trajectories. The usefulness of functional mixed-effects models is illustrated by analyzing empirical data from the Early Childhood Longitudinal Study – Kindergarten Class of 1998–1999.

In paper [5] it was shown that a great part of the model theory of logic with the generalized quantifier Qx = "there exist uncountably many x" is reducible to the model theory of first order logic with an extra binary relation symbol. In this paper we consider when the quantifier Qx can be syntactically defined in a first order theory T. That problem was raised by Kosta Došen when he asked if the quantifier Qx can be eliminated in Peano arithmetic. We answer that question fully in this paper. I would like to thank Kosta Došen and Zoran Marković who made valuable suggestions and remarks on a draft of this paper.
