Similar documents
 Found 20 similar documents (search time: 31 ms)
1.
In a common psychological procedure, the subject is presented with a sequence of items and is asked to recall them in order. His response is scored for items reported correctly in their correct position (position score) and for items reported correctly independently of position (item score). Such data are analyzed in terms of a model which assumes that a particular stimulus item may be forgotten entirely (State 0), may be remembered but without any knowledge of its position in the sequence, or may be remembered together with knowledge of its position (State 2). State 2 is related to the position score, and we define a nonexclusive State 1 (which contains all items not in State 0) that is related to the item score. In Part 1, we use the observed item and position scores to derive estimates of the trial-to-trial distribution of the number of items in States 1 and 2. In Part 2 we consider separately each serial position of the stimulus, and derive estimates of the probability that each individual item is in State 1 and State 2. The model handles omissions and second guesses, and gives sensitive estimates of partial information. Fast Fortran computer programs are available for all computations. In general, whenever responses are scored for items and/or for position, and when no alternative model is being tested, it is recommended that the above model be used to correct for the effects of guessing.

2.
The current study examined the relationship between test-taker cognition and psychometric item properties in multiple-selection multiple-choice and grid items. In a study with content-equivalent mathematics items in alternative item formats, adult participants’ tendency to respond to an item was affected by the presence of a grid and variations of answer options. The results of an item response theory analysis were consistent with the hypothesized cognitive processes in alternative item formats. The findings suggest that seemingly subtle variations of item design could substantially affect test-taker cognition and psychometric outcomes, emphasizing the need for investigating item format effects at a fine-grained level.

3.
4.
Background: Although there have been numerous studies conducted on the psychometric properties of Biggs' Learning Process Questionnaire (LPQ), these have involved the use of traditional omnibus measures of scale quality such as corrected item total correlations, internal consistency estimates of reliability, and factor analysis. However, these omnibus measures of scale quality are sample dependent and fail to model item responses as a function of trait level. And since the item trait relationship is typically nonlinear, traditional factor analytic methods are inappropriate. Aims: The purpose of this study was to identify a unidimensional subset of LPQ items and examine the effectiveness of these items and their options in discriminating between changes in the underlying trait level. In addition to assessing item quality, we were interested in assessing overall scale quality with non‐sample dependent measures. Method: The sample was split into two nearly equal halves, and a unidimensional subset of items was identified in one of these samples and cross‐validated in the other. The nonlinear relationship between the probability of endorsing an item option and the underlying trait level was modelled using a nonparametric latent trait technique known as kernel smoothing and implemented with the program TestGraf. After item and scale quality were established, maximum likelihood estimates of participants' trait level were obtained and used to examine grade and gender differences. Results: A unidimensional subset of 16 deep and achieving items was identified. Slightly more than half of these items needed some of their options combined so that the probability of endorsing an item option as a function of increasing trait level corresponded to the ideal rank ordering of the item options. With this adjustment, scale quality as measured by the information function and standard error function was found to be good. However, no statistically significant gender differences were observed and, although statistically significant grade differences were observed, they were not substantively meaningful. Conclusions: The use of nonparametric kernel‐smoothing techniques is advocated over parametric latent trait methods for the analysis of attitudinal and psychological measures involving polychotomous ordered‐response categories. It is also suggested that latent trait methods are more appropriate than traditional test‐based measures for studying differential item functioning both within and between cultures. Nonparametric kernel‐smoothing techniques hold particular promise in identifying and understanding cross‐cultural differences in student approaches to learning at both the item and scale level.

5.
There is a consensus that visual working memory (WM) resources are sharply limited, but debate persists regarding the simple question of whether there is a limit to the total number of items that can be stored concurrently. Zhang and Luck (2008) advanced this debate with an analytic procedure that provided strong evidence for random guessing responses, but their findings can also be described by models that deny guessing while asserting a high prevalence of low precision memories. Here, we used a whole report memory procedure in which subjects reported all items in each trial and indicated whether they were guessing with each response. Critically, this procedure allowed us to measure memory performance for all items in each trial. When subjects were asked to remember 6 items, the response error distributions for about 3 out of the 6 items were best fit by a parameter-free guessing model (i.e., a uniform distribution). In addition, subjects’ self-reports of guessing precisely tracked the guessing rate estimated with a mixture model. Control experiments determined that guessing behavior was not due to output interference, and that there was still a high prevalence of guessing when subjects were instructed not to guess. Our novel approach yielded evidence that guesses, not low-precision representations, best explain limitations in working memory. These guesses also corroborate a capacity-limited working memory system – we found evidence that subjects are able to report non-zero information for only 3–4 items. Thus, WM capacity is constrained by an item limit that precludes the storage of more than 3–4 individuated feature values.
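
The mixture model mentioned in the abstract above can be illustrated with a minimal sketch. The abstract states only that guesses follow a uniform distribution; the von Mises form of the memory component, the function names, and the parameter names `guess_rate` and `kappa` are assumptions made here for illustration and may differ from what the authors fitted.

```python
import math

def bessel_i0(x, terms=50):
    """Modified Bessel function I0, summed from its power series."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_pdf(error, kappa):
    """Density of a zero-centred von Mises distribution (assumed memory component)."""
    return math.exp(kappa * math.cos(error)) / (2.0 * math.pi * bessel_i0(kappa))

def mixture_pdf(error, guess_rate, kappa):
    """Density of a response error (radians) under a guessing/memory mixture."""
    uniform = 1.0 / (2.0 * math.pi)  # parameter-free guessing component
    return guess_rate * uniform + (1.0 - guess_rate) * von_mises_pdf(error, kappa)
```

With `guess_rate` set to 1 the density is flat at 1/(2π) for every error, which corresponds to the parameter-free guessing model the abstract describes.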

6.
Understanding the nature of science (NOS) is a critical aspect of scientific reasoning, yet few studies have investigated its developmental beginnings and initial structure. One contributing reason is the lack of an adequate instrument. Two studies assessed NOS understanding among third graders using a multiple‐select (MS) paper‐and‐pencil test. Study 1 investigated the validity of the MS test by presenting the items to 68 third graders (9‐year‐olds) and subsequently interviewing them on their underlying NOS conception of the items. All items were significantly related between formats, indicating that the test was valid. Study 2 applied the same instrument to a larger sample of 243 third graders, and their performance was compared to a multiple‐choice (MC) version of the test. Although the MC format inflated the guessing probability, there was a significant relation between the two formats. In summary, the MS format was a valid method revealing third graders' NOS understanding, thereby representing an economical test instrument. A latent class analysis identified three groups of children with expertise in qualitatively different aspects of NOS, suggesting that there is not a single common starting point for the development of NOS understanding; instead, multiple developmental pathways may exist.

7.
Ambiguous response formats predict correlations from -.467 to -1 between opposite items, depending on whether the respondent's interpretation of the format is unipolar or bipolar. The authors present a procedure to investigate the proper interpretation in each case. It consists of applying nonparametric and parametric item response theory models (the Mokken and the graded response models) to pairs of opposite items in order to find the locations of the response options along the latent scale and, therefore, identify the response format construction. The authors tested this procedure on 4 samples (Ns=142-1,150) and 2 item pairs ("relaxed"-"tense" and "optimistic"-"pessimistic"). The results revealed that respondents constructed the formats as bipolar and supported the bipolarity of the item pairs.

8.
Classical item analysis procedures were developed for dichotomously scored items and do not apply to items allowing multiple correct responses. Maximum likelihood procedures analogous to those employed in polychotomous bio-assay are presented which yield estimates of the sets of parameters for items having multiple nonordered responses. Expressions for the estimates of the asymptotic variances of the item parameters and an overall chi-square goodness-of-fit test are also provided.

9.
Multiple‐choice (MC) tests are arguably the most widely used testing format in applied settings. In the psychometric and education literatures, research on the optimal number of options for knowledge and ability MC tests has revealed that three‐option tests are psychometrically equivalent and, in some cases, superior to five‐option tests. In addition, there are a number of practical, economic, and administrative advantages associated with the use of three‐option MC tests. Yet, despite its advantages, the three‐option format is underutilized in personnel selection. Across two studies, we compared test‐taker perceptions, criterion‐related validity, and sex‐based subgroup differences, and in Study 1, we compared race‐based subgroup differences on three‐ and five‐option tests. Participants in the two studies completed a three‐ or five‐option version of ACT. Test perceptions, criterion‐related validity, and race‐ and sex‐based subgroup differences were similar across test formats. The implications for the expanded use of three‐option tests in applied settings and future directions for research are discussed.

10.
Previous research has shown that multiple choice tests often improve memory retention. However, the presence of incorrect lures often attenuates this memory benefit. The current research examined the effects of “all of the above” (AOTA) options. When such options are correct, no incorrect lures are present. In the first three experiments, a correct AOTA option on an initial test led to a larger memory benefit than no test and standard multiple choice test conditions. The benefits of a correct AOTA option occurred even without feedback on the initial test; for both 5-minute and 48-hour retention delays; and for both cued recall and multiple choice final test formats. In the final experiment, an AOTA question led to better memory retention than did a control condition that had identical timing and exposure to response options. However, the benefits relative to this control condition were similar regardless of the type of multiple choice test (AOTA or not). Results suggest that retrieval contributes to multiple choice testing effects. However, the extra testing effect from a correct AOTA option, rather than being due to more retrieval, might be due simply to more exposure to correct information.

11.
A multivariate logistic latent trait model for items scored in two or more nominal categories is proposed. Statistical methods based on the model provide 1) estimation of two item parameters for each response alternative of each multiple choice item and 2) recovery of information from wrong responses when estimating latent ability. An application to a large sample of data for twenty vocabulary items shows excellent fit of the model according to a chi-square criterion. Item and test information curves are compared for estimation of ability assuming multiple category and dichotomous scoring of these items. Multiple scoring proves substantially more precise for subjects of less than median ability, and about equally precise for subjects above the median. Preparation of this paper was supported in part by N.S.F. Grant GS-2900.
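
A multivariate logistic model with two parameters per response alternative can be sketched as a softmax over category-specific linear functions of ability. The parameterisation below (a slope and an intercept per alternative) and the function name are assumptions for illustration; the abstract does not give the exact form used in the paper.

```python
import math

def category_probabilities(theta, slopes, intercepts):
    """Probability of choosing each response alternative at ability theta.

    Each alternative k has two parameters (assumed form): slope a_k and
    intercept c_k, combined as a_k * theta + c_k and normalised by a softmax.
    """
    logits = [a * theta + c for a, c in zip(slopes, intercepts)]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(z - peak) for z in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

Because every alternative, including each wrong one, has its own curve, a wrong response still shifts the likelihood over ability, which is one way to read the abstract's "recovery of information from wrong responses".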

12.
Preferences of 2 children with developmental disabilities, whose functional analyses indicated that their problem behavior was maintained by access to tangible items, were assessed using three formats (i.e., paired stimulus [PS], multiple‐stimulus without replacement [MSWO], and free operant [FO]). The experimenter administered each format five times and compared levels of problem behavior across formats in a multielement design. Both participants exhibited problem behavior in PS and MSWO formats but not in the FO format. Results are discussed in terms of recommendations for practitioners.

13.
Conventional wisdom and studies of unconscious processing suggest that sleeping on a choice may improve decision making. Although sleep has been shown to benefit several cognitive tasks, including problem solving, its impact on everyday choices remains unclear. Here we explore the effects of ‘sleeping on it’ on preference‐based decisions among multiple options. In two studies, individuals viewed several attributes describing a set of items and were asked to select their preferred item after a 12‐hour interval that either contained sleep or was spent fully awake. After an overnight period including sleep, individuals showed increases in positive perceptions of the choice set. This finding contrasts with previous research showing that sleep selectively enhances recall for negative information. In addition, this increase in positive recall did not translate into a greater desire to purchase their preferred item or into an overall benefit for choice satisfaction. Time‐of‐day controls were used to confirm that the observed effects could not be explained by circadian influences. Thus, we show that people may feel more positive about the choice options but not more confident about the choice after ‘sleeping on’ a subjective decision. We discuss how the valence of recalled choice set information may be important in understanding the effects of sleep on multi‐attribute decision making and suggest several avenues for future research. Copyright © 2015 John Wiley & Sons, Ltd.

14.
Pseudo-guessing parameters are present in item response theory applications for many educational assessments. When sample size is not sufficiently large, the guessing parameters may be omitted from the analysis. This study examines the impact of ignoring pseudo-guessing parameters on measurement invariance analysis, specifically, on item difficulty, item discrimination, and mean and variance of ability distribution. Results show that when non-zero guessing parameters are omitted from the measurement invariance analysis, item discrimination estimates tend to decrease particularly for more difficult items, and item difficulty estimates decrease unless the items are highly discriminating and difficult. As the guessing parameter increases, the size of the decrease in item discrimination and difficulty tends to increase, and the estimated mean and variance of ability distribution tend to be inaccurate. When two groups have heterogeneous ability distributions, ignoring the guessing parameter affects the reference group and the focal group differently. Implications of the findings are discussed.
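
For readers unfamiliar with the pseudo-guessing parameter, a minimal sketch of the standard three-parameter logistic (3PL) item response function may help. The abstract does not state which model the study fitted, so the 3PL form and the names `a` (discrimination), `b` (difficulty), and `c` (pseudo-guessing) are assumptions here.

```python
import math

def three_pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model.

    c is the lower asymptote: the chance that a very low-ability
    respondent still answers correctly, for example by guessing.
    Setting c = 0 gives the two-parameter model that ignores guessing.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
```

At `theta == b` the function returns `c + (1 - c) / 2`, so an item with a non-zero `c` looks easier at its difficulty point than the same item under a model that sets `c` to zero.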

15.
On distractor-identification tests students mark as many distractors as possible on each test item. A grading scale is developed for this type of testing. The scale is optimal in that it is the unique scale giving an unbiased estimate of the student's true score, i.e., the score that would result if no guessing occurred. If the test is administered as a usual multiple choice test and graded using the usual correction for guessing scale, the expected item score is the same as for the distractor-identification testing using the optimal grading scale. However, the variance of the item score is shown to be less for distractor-identification testing than for usual multiple choice testing under certain conditions.
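
The "usual correction for guessing" referred to above is commonly taken to be the formula score R - W/(k - 1) for k-option items; the sketch below assumes that reading. It does not reproduce the paper's optimal scale for distractor-identification tests, which the abstract does not specify, and the function names are hypothetical.

```python
from fractions import Fraction

def corrected_score(num_right, num_wrong, num_options):
    """Formula score R - W / (k - 1); omitted items count as zero."""
    return Fraction(num_right) - Fraction(num_wrong, num_options - 1)

def expected_item_score_when_guessing(num_options):
    """Expected corrected score on one item answered by blind guessing."""
    p_right = Fraction(1, num_options)
    penalty = Fraction(-1, num_options - 1)  # score for a wrong answer
    return p_right * 1 + (1 - p_right) * penalty
```

Under blind guessing the expected item score is exactly zero for any number of options, which is the sense in which such a scale estimates the score that would result if no guessing occurred.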

16.
Likert-type scales are commonly used when assessing attitudes, personality characteristics, and other psychological variables. This study examined the effect of varying the number of response options on the same set of 28 attitudinal items. Participants answered items using either a 4-point scale (forced choice), a 5-point scale that included a “neither” mid-point, or a 4-point scale with an option of “no opinion” presented after the item. The questionnaire also included an item asking participants what they believe the midpoint in a scale indicated. As predicted, participants’ interpretations of the midpoint varied widely with the most common responses being: “no opinion,” “don't care,” “unsure,” “neutral,” “equal/both,” and “neither.” The quantitative results showed that participants’ levels of item endorsement varied based on the response options offered. For example, “neither” was chosen more often than “no opinion” on all of the items.

17.
We report three studies showing that in prospective multiple‐trial decisions people often select a mix of sure and risky options over pure bundles of either option. Such a preference is not ‘rational’ because a mixed option cannot be the EV‐maximizing choice. Experiment 1 confirmed a mixed‐option preference for gains but not for losses. Showing a graph of the multiple‐trial outcome distribution reduced but did not eliminate this effect, suggesting that it is not due purely to a failure to aggregate correctly over the multiple trials. Experiment 2 replicated the mixed option preference using a wider range of problems. Experiment 3 compared choices in the trinary choice conditions used in Experiments 1 and 2 with binary choices between pairs of the multiple‐trial sure, mixed, and risky options. In the binary choice condition the mixed option was no longer the modal choice, suggesting that the strong mixed option preference found in the trinary choice conditions is mainly due to a compromise effect. However, the binary choice probabilities did show violations of strong stochastic transitivity in a pattern that suggested a slight bias toward the mixed option. Copyright © 2006 John Wiley & Sons, Ltd.

18.
Item bundles
An item bundle is a small group of multiple choice items that share a common reading passage or graph, or a small group of matching items that share distractors. Item bundles are easily identified by paging through a copy of a test. Bundled items may violate the latent conditional independence assumption of unidimensional item response theory (IRT), but such a violation would not typically suggest the existence of a new fundamental human ability to read one specific reading passage or to interpret one specific graph. It is important, therefore, to have theoretical concepts and empirical checks that distinguish between, on the one hand, anticipated violations of latent conditional independence within item bundles, and, on the other hand, violations that cannot be attributed to idiosyncratic features of test format and instead suggest departures from unidimensionality. To this end, two theorems on unidimensional IRT are extended to describe observable item response distributions when there is conditional independence between but not necessarily within item bundles. The author is grateful to Ivo Molenaar and the referees for many helpful suggestions, and to D. Thayer for assistance with computing.

19.
The purpose of this study was to compare four measures of difficulty of mental multiplication items: percentage of pupillary dilation, latency of solution, number of correct responses, and judgment of item difficulty. Sixteen multiplication problems, classified into four levels of difficulty, were presented visually to 13 Ss, who verbalized their solutions to the problems. Analyses of variance and correlation coefficients were computed. It was concluded that all four measures of difficulty were useful but that judgment of difficulty and latency of solution were better measures of item difficulty than were the other two. A discussion of pupillary dilation and information processing is included.

20.
Two experiments were conducted on edited TV newscast sequences to clarify effects of film accompaniment on learning from heard news text. In Experiment 1, 150 British subjects viewed a sequence with either film format throughout or alternating film and ‘talking head’ format between items. Those items that were presented by ‘talking heads’ in the mixed sequence were learned better with film format, in which the heard text was accompanied by appropriate moving pictures. However, no effect of uniform context was found on the remaining items. In Experiment 2, 91 German subjects viewed one of four versions of a bulletin, one with ‘talking head’, one with film throughout, the other two having complementary mixed-format patterns. Besides confirming a beneficial effect of film presentation over ‘talking head’ which accords with the findings of all studies of learning from material in the form of national network newscasts, the results showed an impairing effect of uniform visual format. This can explain ‘contradictory’ findings, notably with atypical test material.
