Similar documents
20 similar documents found (search time: 15 ms)
1.
Over the past thirty years, obtaining diagnostic information from examinees’ item responses has become an increasingly important feature of educational and psychological testing. This objective can be achieved by sequentially selecting multidimensional items matched to the latent traits being assessed, making Multidimensional Computerized Adaptive Testing (MCAT) a natural approach to the task. This study rigorously investigates the relationships among four promising item selection methods: D-optimality, the KL information index, continuous entropy, and mutual information. Theoretical connections among the methods are demonstrated to show how information about the unknown vector θ can be gained from different perspectives. Two simulation studies were carried out to compare the performance of the four methods. The results showed that mutual information not only improved overall estimation accuracy but also yielded the smallest conditional mean squared error over most of the θ space. Finally, overlap rates were calculated to empirically show the similarities and differences among the four methods.
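The mutual-information criterion mentioned above can be illustrated in a deliberately simplified form. The sketch below is a hypothetical one-dimensional illustration (the paper's setting is multidimensional): it discretizes the current posterior over θ on a grid, assumes a 2PL response model with illustrative item parameters a and b, and scores a candidate item by the mutual information between the binary response and θ.

```python
import numpy as np

# Discretized posterior over theta (assumed standard-normal shape, up to a constant).
theta = np.linspace(-4, 4, 401)
posterior = np.exp(-0.5 * theta**2)
posterior /= posterior.sum()

def p_correct(theta, a, b):
    """2PL response probability; a and b are illustrative item parameters."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def mutual_information(theta, posterior, a, b):
    """I(X; theta) for a binary response X under the gridded posterior."""
    p1 = p_correct(theta, a, b)        # P(X = 1 | theta)
    m1 = np.sum(posterior * p1)        # marginal P(X = 1)
    m0 = 1.0 - m1
    # I(X; theta) = sum_x sum_theta P(theta) P(x|theta) log[P(x|theta) / P(x)]
    term1 = posterior * p1 * np.log(p1 / m1)
    term0 = posterior * (1.0 - p1) * np.log((1.0 - p1) / m0)
    return np.sum(term1 + term0)

# An item located at the posterior mode should carry more information
# than a poorly targeted one (difficulty far from the mode).
print(mutual_information(theta, posterior, 1.2, 0.0) >
      mutual_information(theta, posterior, 1.2, 3.0))
```

This is the intuition behind adaptive selection: the candidate item whose response would reduce the most uncertainty about θ is administered next.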

2.
Abstract

In intervention studies having multiple outcomes, researchers often use a series of univariate tests (e.g., ANOVAs) to assess group mean differences. Previous research found that this approach properly controls Type I error and generally provides greater power compared to MANOVA, especially under realistic effect size and correlation combinations. However, when group differences are assessed for a specific outcome, these procedures are strictly univariate and do not consider the outcome correlations, which may be problematic with missing outcome data. Linear mixed or multivariate multilevel models (MVMMs), implemented with maximum likelihood estimation, present an alternative analysis option in which outcome correlations are taken into account when specific group mean differences are estimated. In this study, we use simulation methods to compare the performance of separate independent-samples t tests estimated with ordinary least squares and analogous t tests from MVMMs for assessing two-group mean differences with multiple outcomes under small-sample and missingness conditions. Study results indicated that an MVMM implemented with restricted maximum likelihood estimation combined with the Kenward–Roger correction had the best performance. Therefore, for intervention studies with small N and normally distributed multivariate outcomes, the Kenward–Roger procedure is recommended over traditional methods and conventional MVMM analyses, particularly with incomplete data.

3.
Distribution-free tests of stochastic dominance for small samples
One variable is said to “stochastically dominate” another if, for every value x, an observation from the first variable is less likely to fall below x than an observation from the second. Inferring stochastic dominance from data samples is important for many applications of econometrics and experimental psychology, but little is known about the performance of existing inferential methods. Through simulation, we show that three of the most widely used inferential methods are inadequate for use in small samples of the size commonly encountered in many applications (up to 400 observations from each distribution). We develop two new inferential methods that perform very well in a limited, but practically important, case where the two variables are guaranteed not to be equal in distribution. We also show that extensions of these new methods, and an improved version of an existing method, perform quite well in the original, unlimited case.
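The dominance condition being tested can be stated at the sample level with empirical CDFs. The sketch below is not one of the paper's inferential methods; it only checks the descriptive condition that one sample's empirical CDF lies at or below the other's everywhere, using illustrative simulated data with a large mean shift so the ordering is unambiguous.

```python
import numpy as np

def ecdf(sample, grid):
    """Empirical CDF of `sample` evaluated at each point of `grid`."""
    sample = np.sort(sample)
    return np.searchsorted(sample, grid, side="right") / len(sample)

def dominates(x, y):
    """True if x first-order stochastically dominates y in the samples,
    i.e. F_x(t) <= F_y(t) at every observed value t."""
    grid = np.concatenate([x, y])
    return bool(np.all(ecdf(x, grid) <= ecdf(y, grid)))

rng = np.random.default_rng(0)
x = rng.normal(3.0, 1.0, 400)   # shifted well above y
y = rng.normal(0.0, 1.0, 400)
print(dominates(x, y), dominates(y, x))
```

With realistic, closer distributions the sample CDFs typically cross somewhere, which is exactly why the inferential problem studied in the paper is hard in small samples.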

4.
We describe methods for assessing all possible criteria (i.e., dependent variables) and subsets of criteria for regression models with a fixed set of predictors, x (where x is an n×1 vector of independent variables). Our methods build upon the geometry of regression coefficients (hereafter called regression weights) in n-dimensional space. For a full-rank predictor correlation matrix, R_xx, of order n, and for regression models with constant R² (coefficient of determination), the OLS weight vectors for all possible criteria terminate on the surface of an n-dimensional ellipsoid. The population performance of alternate regression weights—such as equal weights, correlation weights, or rounded weights—can be modeled as a function of the Cartesian coordinates of the ellipsoid. These geometrical notions can be easily extended to assess the sampling performance of alternate regression weights in models with either fixed or random predictors and for models with any value of R². To illustrate these ideas, we describe algorithms and R (R Development Core Team, 2009) code for: (1) generating points that are uniformly distributed on the surface of an n-dimensional ellipsoid, (2) populating the set of regression (weight) vectors that define an elliptical arc in ℝ^n, and (3) populating the set of regression vectors that have constant cosine with a target vector in ℝ^n. Each algorithm is illustrated with real data. The examples demonstrate the usefulness of studying all possible criteria when evaluating alternate regression weights in regression models with a fixed set of predictors.
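Step (1), generating points uniformly distributed on an ellipsoid surface, can be sketched with a standard rejection approach (an assumption on my part; the paper's own algorithm and R code may differ): sample a uniform direction on the unit sphere, map it onto the axis-aligned ellipsoid, and accept with probability proportional to the local area distortion of that map, so that accepted points are uniform with respect to surface area.

```python
import numpy as np

rng = np.random.default_rng(1)

def uniform_on_ellipsoid(axes, size):
    """Rejection sampler for points uniform w.r.t. surface area on the
    axis-aligned ellipsoid sum((x_i / a_i)^2) = 1, with semi-axes `axes`."""
    axes = np.asarray(axes, dtype=float)
    out = []
    bound = 1.0 / axes.min()             # max of ||u / axes|| over the unit sphere
    while len(out) < size:
        u = rng.normal(size=axes.shape)  # uniform direction on the unit sphere
        u /= np.linalg.norm(u)
        # area-distortion weight of the map u -> axes * u
        w = np.linalg.norm(u / axes)
        if rng.uniform() < w / bound:
            out.append(axes * u)
    return np.array(out)

pts = uniform_on_ellipsoid([3.0, 2.0, 1.0], 500)
# every accepted point lies exactly on the ellipsoid
print(np.allclose(np.sum((pts / [3.0, 2.0, 1.0])**2, axis=1), 1.0))
```

Simply rescaling uniform sphere points by the semi-axes, without the rejection step, would over-represent the flatter regions of the surface; the weight w corrects for this.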

5.
This paper uses a simulation approach to investigate how different attribute weighting techniques affect the quality of decisions based on multiattribute value models. The weighting methods considered include equal weighting of all attributes, two methods for using judgments about the rank ordering of weights, and a method for using judgments about the ratios of weights. The question addressed is: How well does each method perform when based on judgments of attribute weights that are unbiased but subject to random error? To address this question, we employ simulation methods. The simulation results indicate that ratio weights were either better than rank order weights (when error in the ratio weights was small or moderate) or tied with them (when error was large). Both ratio weights and rank order weights were substantially superior to the equal weights method in all cases studied. Our findings suggest that it will usually be worth the extra time and effort required to assess ratio weights. In cases where the extra time or effort required is too great, rank order weights will usually give a good approximation to the true weights. Comparisons of the two rank-order weighting methods favored the rank-order-centroid method over the rank-sum method. © 1998 John Wiley & Sons, Ltd.
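The two rank-based schemes compared above have simple closed forms. For n attributes ranked from 1 (most important) to n, rank-sum assigns w_r = (n − r + 1) / (n(n + 1)/2), while rank-order centroid assigns w_r = (1/n) · Σ_{j=r}^{n} 1/j. A minimal sketch:

```python
# Rank-sum weights: proportional to the reversed rank.
def rank_sum_weights(n):
    total = n * (n + 1) / 2
    return [(n - r + 1) / total for r in range(1, n + 1)]

# Rank-order centroid weights: average of the vertices of the simplex
# of all weight vectors consistent with the stated ranking.
def roc_weights(n):
    return [sum(1.0 / j for j in range(r, n + 1)) / n for r in range(1, n + 1)]

print([round(w, 4) for w in rank_sum_weights(4)])  # [0.4, 0.3, 0.2, 0.1]
print([round(w, 4) for w in roc_weights(4)])       # [0.5208, 0.2708, 0.1458, 0.0625]
```

Note that the centroid method spreads the weights out more steeply, giving noticeably more weight to the top-ranked attribute than rank-sum does.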

6.
Relative Importance Analysis: A Useful Supplement to Regression Analysis
This article advocates for the wider use of relative importance indices as a supplement to multiple regression analyses. The goal of such analyses is to partition explained variance among multiple predictors to better understand the role played by each predictor in a regression equation. Unfortunately, when predictors are correlated, the metrics typically relied upon are flawed indicators of variable importance. To that end, we highlight the key benefits of two relative importance analyses, dominance analysis and relative weight analysis, over estimates produced by multiple regression analysis. We also describe numerous situations where relative importance weights should be used, while simultaneously cautioning readers about the limitations and misconceptions regarding the use of these weights. Finally, we present step-by-step recommendations for researchers interested in incorporating these analyses in their own work and point them to available web resources to assist them in producing these weights.

7.
In a multiple regression analysis with three or more predictors, every set of alternate weights belongs to an infinite class of “fungible weights” (Waller, Psychometrika, in press) that yields identical SSE (sum of squared errors) and R² values. When the R² using the alternate weights is a fixed value, fungible weights (a_i) that yield the maximum or minimum cosine with an OLS weight vector (b) are called “fungible extrema.” We describe two methods for locating fungible extrema and we report R code (R Development Core Team, 2007) for one of the methods. We then describe a new approach for populating a class of fungible weights that is derived from the geometry of alternate regression weights. Finally, we illustrate how fungible weights can be profitably used to gauge parameter sensitivity in linear models by locating the fungible extrema of a regression model of executive compensation (Horton & Guerard, Commun. Stat. Simul. Comput. 14:441–448, 1985).

8.
To assign an overall performance rating to a target, a rater must weight and combine various pieces of specific performance information about that target. Policy‐capturing research has demonstrated that individual differences in raters can influence the way raters combine specific performance information. The current study examined information processing from a different perspective by exploring the possibility that target differences may also influence the way raters weight and combine performance information. Raters (N = 146) rated each of six targets on six specific performance dimensions and on overall performance. Sequential moderation analyses indicated that targets influenced the way raters, as a group, combined information across targets. These results lend support to the inference that overall performance ratings may not be comparable across targets, that is, they may not reflect the same underlying performance across targets.

9.
Onset dominance in sound localization was examined by estimating observer weighting of interaural delays for each click of a train of high-frequency filtered clicks. The interaural delay of each click was a normal deviate that was sampled independently on each trial of a single-interval design. In Experiment 1, observer weights were derived for trains of n = 2, 4, 8, or 16 clicks as a function of interclick interval (ICI = 1.8, 3.0, or 12.0 msec). For small n and short ICIs (1.8 msec), the ratio of the onset weight to the remaining weights was as large as 10. As ICI increased, the relative onset weight was reduced. For large n and all ICIs, the ongoing train was weighted more heavily than the onset. This diminishing relative onset weight with increasing ICI and n is consistent with optimum distribution of weights among components. Efficiency of weight distribution is near ideal when ICI = 12 msec and n = 2 and very poor for shorter ICIs and larger values of n. Further experiments showed that: (1) onset dominance involves both within- and between-frequency-channel mechanisms, and (2) the stimulus configuration (ICI, n, frequency content, and temporal gaps) affects weighting functions in a complex way not explained by cross-correlation analysis or contralateral inhibition (Lindemann, 1986a, 1986b).

10.
A visual cognitive behavioral framework developed, with indirect literature support, from the author's practice depicts a polarized conflict between a rational side and an irrational side, named Igor. Each side's characteristics, reasoning, and methods of dealing with thoughts, feelings, and behaviors in panic/phobic disorders are described. Techniques of a group therapy approach include members sharing symptoms, fears, and escape behaviors; the therapist providing corrective information and role playing of each side; and therapist/members monitoring and supporting behavioral homework. Outcome measures appear to support desired simplicity, palatability, efficiency, and effectiveness.

11.
Although growing up in stressful conditions can undermine mental abilities, people in harsh environments may develop intact, or even enhanced, social and cognitive abilities for solving problems in high‐adversity contexts (i.e. ‘hidden talents’). We examine whether childhood and current exposure to violence are associated with memory (number of learning rounds needed to memorize relations between items) and reasoning performance (accuracy in deducing a novel relation) on transitive inference tasks involving both violence‐relevant and violence‐neutral social information (social dominance vs. chronological age). We hypothesized that individuals who had more exposure to violence would perform better than individuals with less exposure on the social dominance task. We tested this hypothesis in a preregistered study in 100 Dutch college students and 99 Dutch community participants. We found that more exposure to violence was associated with lower overall memory performance, but not with reasoning performance. However, the main effects of current (but not childhood) exposure to violence on memory were qualified by significant interaction effects. More current exposure to neighborhood violence was associated with worse memory for age relations, but not with memory for dominance relations. By contrast, more current personal involvement in violence was associated with better memory for dominance relations, but not with memory for age relations. These results suggest incomplete transfer of learning and memory abilities across contents. This pattern of results, which supports a combination of deficits and ‘hidden talents,’ is striking in relation to the broader developmental literature, which has nearly exclusively reported deficits in people from harsh conditions. A video abstract of this article can be viewed at: https://youtu.be/e4ePmSzZsuc.

12.
We investigate methods developed in multiple criteria decision‐making that use ordinal information to estimate numerical values. Such methods can be used to estimate attribute weights, attribute values, or event probabilities given ranks or partial ranks. We first review related studies and then develop a generalized rank‐sum (GRS) approach, providing a derivation of the previously proposed rank‐sum approach. The GRS approach allows for incorporating the concept of degree of importance (or, for probabilities, difference in likelihood, and for attribute values, difference in value), information that most other rank‐based formulas do not utilize. We then present simulation results comparing the GRS method with other rank‐based formulas such as the rank order centroid method, and comparing GRS with as many as three levels of importance (i.e., GRS‐3) against Simos' procedure (which can also incorporate degree of importance). To our surprise, incorporating the additional degree‐of‐importance information, in both GRS‐3 and Simos' procedure, did not result in better performance than the rank order centroid method or GRS. Further research is needed to investigate the modelling of such extra information. We also explore the scenario in which a decision‐maker has indifference judgments and cannot provide a complete rank order. Copyright © 2014 John Wiley & Sons, Ltd.
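The abstract does not reproduce the GRS formula itself, so the following is only a hypothetical illustration of the general idea of folding "degree of importance" judgments into a rank-sum-style scheme: instead of treating the gap between adjacent ranks as one unit, the decision-maker supplies judged gap sizes (e.g., levels 1–3 for slightly/moderately/much more important than the next attribute), and weights are renormalized cumulative scores.

```python
# Hypothetical sketch only; not the paper's GRS derivation.
def gap_weights(gaps):
    """Weights from judged gaps between adjacent ranks.

    gaps[i] = judged importance gap between attribute i and i + 1,
    most important first (n - 1 values for n attributes)."""
    # cumulative score of each attribute: 1 plus all gaps below it
    scores = [sum(gaps[i:]) + 1 for i in range(len(gaps))] + [1]
    total = sum(scores)
    return [s / total for s in scores]

print([round(w, 3) for w in gap_weights([1, 1, 1])])  # unit gaps reduce to rank-sum
print([round(w, 3) for w in gap_weights([3, 1, 1])])  # top attribute judged much more important
```

With unit gaps this reproduces ordinary rank-sum weights, which matches the paper's framing of GRS as a generalization of the rank-sum approach; whether richer gap information actually helps is exactly the question its simulations answer in the negative.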

13.
Stern, Reuben. Synthese 2019, 198(27): 6505–6527

Though common sense says that causes must temporally precede their effects, the hugely influential interventionist account of causation makes no reference to temporal precedence. Does common sense lead us astray? In this paper, I evaluate the power of the commonsense assumption from within the interventionist approach to causal modeling. I first argue that if causes temporally precede their effects, then one need not consider the outcomes of interventions in order to infer causal relevance, and that one can instead use temporal and probabilistic information to infer exactly when X is causally relevant to Y in each of the senses captured by Woodward’s interventionist treatment. Then, I consider the upshot of these findings for causal decision theory, and argue that the commonsense assumption is especially powerful when an agent seeks to determine whether so-called “dominance reasoning” is applicable.


14.
This paper addresses the problem of opaque sweetening and argues that one should use stochastic dominance in comparing lotteries even when dealing with incomplete orderings that allow for non-comparable outcomes.

15.
16.
Many current computational models of object categorization either include no explicit provisions for dealing with incomplete stimulus information (e.g. Kruschke, Psychological Review 99:22–44, 1992) or take approaches that are at odds with evidence from other fields (e.g. Verguts, Ameel, & Storms, Memory & Cognition 32:379–389, 2004). In two experiments centered around the inverse base-rate effect, we demonstrate that people not only make highly informed inferences about the values of unknown features, but also subsequently use the inferred values to come to a categorization decision. The inferences appear to be based on immediately available information about the particular stimulus under consideration, as well as on higher-level inferences about the stimulus class as a whole. Implications for future modeling efforts are discussed.

17.
Measurement is a process aimed at acquiring and codifying information about properties of empirical entities. In this paper we provide an interpretation of this process, comparing it with what is nowadays considered the standard measurement theory, i.e., the representational theory of measurement. We maintain that this theory has its merits but is incomplete and too abstract, its main weakness being the scant attention paid to the empirical side of measurement, i.e., to measurement systems and to the ways in which the interactions of such systems with the entities under measurement provide a structure to an empirical domain. In particular we claim that (1) it is on the ground of the interaction with a measurement system that a partition can be induced on the domain of entities under measurement and that relations among such entities can be established, and that (2) it is the usage of measurement systems that guarantees a degree of objectivity and intersubjectivity to measurement results. As modeled in this paper, measurement systems link the abstract theory of measuring, as developed in representational terms, and the practice of measuring, as codified in standard documents such as the International Vocabulary of Metrology.

18.
Given a finite set A of actions evaluated on a set of attributes, preferential information is considered in the form of a pairwise comparison table including pairs of actions from a subset B ⊆ A, described by stochastic dominance relations on particular attributes and a total order on the decision attribute. Using a rough sets approach for the analysis of the subset of preference relations, a set of decision rules is obtained, and these are applied to the set A\B of potential actions. The rough sets approach of looking for a reduction of the set of attributes makes it possible to operate with a multi‐attribute stochastic dominance relation for a reduced number of attributes. Copyright © 1999 John Wiley & Sons, Ltd.

19.
In applications of item response theory, assessment of model fit is a critical issue. Recently, limited‐information goodness‐of‐fit testing has received increased attention in the psychometrics literature. In contrast to full‐information test statistics such as Pearson’s X² or the likelihood ratio G², these limited‐information tests utilize lower‐order marginal tables rather than the full contingency table. A notable example is Maydeu‐Olivares and colleagues’ M₂ family of statistics based on univariate and bivariate margins. When the contingency table is sparse, tests based on M₂ retain better Type I error rate control than the full‐information tests and can be more powerful. While in principle the M₂ statistic can be extended to test hierarchical multidimensional item factor models (e.g., bifactor and testlet models), the computation is non‐trivial. To obtain M₂, a researcher often has to obtain (many thousands of) marginal probabilities, derivatives, and weights, each of which must be approximated with high‐dimensional numerical integration. We propose a dimension reduction method that can take advantage of the hierarchical factor structure so that the integrals can be approximated far more efficiently. We also propose a new test statistic that can be substantially better calibrated and more powerful than the original M₂ statistic when the test is long and the items are polytomous. We use simulations to demonstrate the performance of our new methods and illustrate their effectiveness with applications to real data.

20.
Information about others' success in remembering is frequently available. For example, students taking an exam may assess its difficulty by monitoring when others turn in their exams. In two experiments, we investigated how rememberers use this information to guide recall. Participants studied paired associates, some semantically related (and thus easier to retrieve) and some unrelated (and thus harder). During a subsequent cued recall test, participants viewed fictive information about an opponent's accuracy on each item. In Experiment 1, participants responded to each cue once before seeing the opponent's performance and once afterwards. Participants reconsidered their responses least often when the opponent's accuracy matched the item difficulty (easy items the opponent recalled, hard items the opponent forgot) and most often when the opponent's accuracy and the item difficulty mismatched. When participants responded only after seeing the opponent's performance (Experiment 2), the same mismatch conditions that led to reconsideration even produced superior recall. These results suggest that rememberers monitor whether others' knowledge states accord or conflict with their own experience, and that this information shifts how they interrogate their memory and what they recall.


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号