Similar literature
1.
A procedure for point and interval estimation of maximal reliability of multiple‐component measuring instruments in multi‐level settings is outlined. The approach is applicable to hierarchical designs in which individuals are nested within higher‐order units and exhibit possibly related performance on components of a given homogeneous scale. The method is developed within the framework of multi‐level factor analysis. The proposed procedure is illustrated with an empirical example.

2.
A method for examining change in maximal reliability for pre‐specified sets of congeneric measures when developing a multi‐component instrument is outlined. The approach is applicable for purposes of estimation and testing of gain or loss in the maximal reliability coefficient as a consequence of adding or dropping one or more measures from a homogeneous composite with uncorrelated errors, as well as when one is concerned with optimal component choice for highest increase or correspondingly smallest drop in maximal reliability. The method is compared with a procedure for ascertaining change in unweighted sum score reliability, and implications for instrument construction and revision are discussed. The approach is illustrated with a numerical example.
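The gain or loss described in this abstract can be illustrated with a minimal sketch. It assumes a single-factor congeneric model with the factor variance fixed at 1 and uncorrelated errors, for which the maximal reliability of the optimally weighted composite is the standard result S / (1 + S), with S the sum of squared loadings divided by error variances. The loadings and error variances below are hypothetical, and the sketch covers only the population formula, not the estimation and testing procedure the paper develops.

```python
def maximal_reliability(loadings, error_variances):
    """Maximal reliability of the optimally weighted composite.

    Assumes one common factor with variance 1 and uncorrelated errors.
    """
    s = sum(l * l / e for l, e in zip(loadings, error_variances))
    return s / (1.0 + s)


def sum_score_reliability(loadings, error_variances):
    """Reliability of the unweighted sum score under the same model."""
    true_var = sum(loadings) ** 2
    return true_var / (true_var + sum(error_variances))


# Hypothetical three-component instrument (illustrative values only).
loadings = [0.8, 0.7, 0.6]
errors = [0.36, 0.51, 0.64]

before = maximal_reliability(loadings, errors)
# Add a hypothetical fourth component and recompute.
after = maximal_reliability(loadings + [0.5], errors + [0.75])
gain = after - before

print(f"maximal reliability before: {before:.4f}")
print(f"maximal reliability after:  {after:.4f}")
print(f"gain from added component:  {gain:.4f}")
print(f"unweighted sum score:       {sum_score_reliability(loadings, errors):.4f}")
```

Under these assumptions an added component with a nonzero loading can only raise maximal reliability, whereas unweighted sum score reliability can fall when a weak component is added, which is one reason the two kinds of change are worth comparing.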

3.
A covariance structure analysis method for improved point and interval estimation of composite reliability in repeated measure designs is outlined that accounts for specificity variance. The approach also permits the testing of time‐invariance in reliability of multiple‐component instruments in terms of the ratio of ‘pure’ measurement error variance to observed scale score variance. In addition, the procedure allows interval estimation of the difference in composite reliability coefficients across assessment occasions. The method described is illustrated with data from a cognitive intervention study.

4.
A method for examining invariance in maximal reliability for weighted combinations of congeneric measures is described. The approach is developed within the framework of covariance structure modelling and allows one to ascertain whether a multi‐component instrument consisting of homogeneous measures is associated with the same minimal relative error variance in distinct populations or over time. The procedure yields as a by‐product an interval measure of discrepancy in maximal reliability across independent groups or assessment occasions, and is illustrated with two examples.

5.
A. Hockey, G. Geffen. Intelligence, 2004, 32(6): 625
To determine whether the visuospatial n-back working memory task is a reliable and valid measure of cognitive processes believed to underlie intelligence, this study compared the reaction times and accuracy of 70 participants on the task with their performance on the Multidimensional Aptitude Battery (MAB). Testing was conducted over two sessions separated by 1 week. Participants completed the MAB during the second test session. Moderate test–retest reliability for percentage accuracy scores was found across the four levels of the n-back task, whilst reaction times were highly reliable. Furthermore, participants' performance on the MAB was negatively correlated with accuracy of performance at the easier levels of the n-back task and positively correlated with accuracy of performance at the harder task levels. These findings confirm previous research examining the cognitive basis of intelligence, and suggest that intelligence is the product of faster speed of information processing, as well as superior working memory capacity.

6.
This paper analyzes existing research on the test–retest reliability of human judgment, i.e. the extent to which a judge makes identical judgments when presented with identical stimuli on two occasions. Only research involving professional judges who make experimental judgments in a reasonable analog of their everyday experience is included. Studies of both internal consistency reliability and temporal stability reliability are analyzed (where the former refers to the inclusion of repeat stimuli in the same experimental session, and the latter refers to the repeating of the experimental task from a few days to several months later). It is found that (1) the test–retest reliability literature is concentrated in four substantive judgment areas (medicine/psychology, meteorology, human resources management, and business), (2) the literature is extremely variable in terms of research approach/design, the determinants or correlates of test–retest reliability that have been studied, and the quality of the execution and analysis, and (3) mean test–retest reliability differs across both substantive judgment areas and the internal consistency versus temporal stability distinction. An inescapable conclusion from the analysis is that our knowledge of this fundamental property of human judgment is quite meager. Therefore, the paper concludes with suggestions about future research that would address test–retest reliability more systematically. Copyright © 2000 John Wiley & Sons, Ltd.

7.
Two recommendations are offered for reporting multiple‐baseline designs across participants. These recommendations will better enable readers to (a) distinguish concurrent from nonconcurrent multiple‐baseline designs and (b) determine the temporal order in which sessions were conducted in concurrent multiple‐baseline designs. Copyright © 2005 John Wiley & Sons, Ltd.

8.
Hammar, Å., Sørensen, L., Årdal, G., Oedegaard, K.J., Kroken, R., Roness, A. & Lund, A. (2009). Enduring cognitive dysfunction in unipolar major depression: A test–retest study using the Stroop‐paradigm. Scandinavian Journal of Psychology. The aim of the study was to investigate automatic and effortful information processing with the Stroop paradigm in a long‐term perspective in patients with major depressive disorder (MDD). Patients were tested on two occasions: at inclusion with a Hamilton Depression Rating Scale (HDRS) score >18, and after 6 months, when most patients had experienced symptom reduction. The Stroop paradigm is considered to measure aspects of attention and executive functioning and consists of three conditions/cards: naming the color of the patches (Color), reading of the color‐words (Word) and naming the ink color of color‐words (Color‐Word). The Color‐Word condition has proved to be the most cognitively demanding task: it requires the proband to actively suppress interference and is therefore considered to require more effortful information processing, whereas naming the color of the patches and reading the color‐words are expected to be more automatic and less cognitively demanding. A homogeneous group of 19 patients with unipolar recurrent MDD according to DSM‐IV and a HDRS score of >18 were included in the study. A control group was individually matched for age, gender and level of education. Depressed patients performed on par with the control group on the Color and Word cards at both test occasions. However, the patients were impaired compared with the control group on the Color‐Word card task at both test occasions. Thus, the depressed patients showed no improvement of effortful attention/executive performance as a function of symptom reduction. The results indicate that the depressed patients showed impaired cognitive performance on cognitively demanding tasks when symptomatic and that this impairment persisted after 6 months, despite significant improvement in their depressive symptoms.

9.
The purpose of the study was to investigate with what accuracy the soleus H-reflex modulation and excitability could be measured during human walking on two occasions separated by days. The maximal M-wave (Mmax) was measured at rest in the standing position. During treadmill walking every stimulus elicited an M-wave of 25 ± 10% of Mmax in the soleus muscle and a supra-maximal stimulus elicited a maximal M-wave 60 ms after the first stimulus. Both Mmax during rest and during walking were later used for normalization. When normalized to resting Mmax, the peak reflex amplitude during walking was 5% lower on Day 2 than on Day 1 (p = .32). However, when the peak H-reflex was normalized to Mmax in every sweep, Day 2 showed a significant 15% lower amplitude (p = .037). The same pattern was found for the mean H-reflex. Spearman’s rho was .92 when normalized to resting Mmax but .88 when normalized to Mmax in every sweep. The Pearson product-moment correlation was used to identify one participant at a time on Day 1 among all seven participants on Day 2. For both normalization procedures 5 of 7 participants were identified by this test. Since only 5 of 7 participants were recognized between days, it is recommended that 10-15 participants be used in training or intervention studies as far as the H-reflex pattern of modulation during movement is concerned.

10.
A confidence interval construction procedure for the proportion of explained variance by a hierarchical, general factor in a multi‐component measuring instrument is outlined. The method provides point and interval estimates for the proportion of total scale score variance that is accounted for by the general factor, which could be viewed as common to all components. The approach may also be used for testing composite (one‐tailed) or simple hypotheses about this proportion, and is illustrated with a pair of examples.

11.
We developed masked visual analysis (MVA) as a structured complement to traditional visual analysis. The purpose of the present investigation was to compare the effects of computer‐simulated MVA of a four‐case multiple‐baseline (MB) design in which the phase lengths are determined by an ongoing visual analysis (i.e., response‐guided) versus those in which the phase lengths are established a priori (i.e., fixed criteria). We observed an acceptably low probability (less than .05) of false detection of treatment effects. The probability of correctly detecting a true effect frequently exceeded .80 and was higher when: (a) the masked visual analyst extended phases based on an ongoing visual analysis, (b) the effects were larger, (c) the effects were more immediate and abrupt, and (d) the effects of random and extraneous error factors were simpler. Our findings indicate that MVA is a valuable combined methodological and data‐analysis tool for single‐case intervention researchers.

12.
Peterson, Deary, and Austin (2003) considered the reliability of the Cognitive Styles Analysis (CSA) (Riding, 1991). The CSA seeks to assess an individual’s position on each of two fundamental style dimensions – the Wholist-Analytic and the Verbal-Imagery dimensions. It presents a series of simple cognitive tasks, which the subjects may choose to process according to their preferred style. Performance on these test items is measured in terms of response times. The CSA comprises 40 items to assess the Wholist-Analytic dimension and 48 for the Verbal-Imagery dimension, and typically takes 15–20 min to complete. It is intended to be suitable for a wide age and ability range, and applicable to a variety of contexts and cultures. The most important characteristic of any test of cognitive style is its temporal stability. Studies which attempt to establish test validity without definitive evidence of test reliability are lacking a basic foundation. Riding has not published any statistical data on the test–retest reliability of the CSA. Peterson et al. (2003) and Peterson (2003) claim to have carried out the primary evaluation of the CSA’s reliability. However, we were the first to publish accurate test–retest reliability data on Riding’s CSA (Redmond, Mullally, & Parkinson, 2002). This brief report addresses the issue of who first established the unreliability of the CSA and why Peterson, Deary and Austin’s claims are misleading and unsubstantiated.

13.
The social relations model (SRM) is a useful tool for measuring relationship effects, defined as the unique perceptions or behaviors of 2 people. The sources of variance in SRM studies are persons (actors and partners), groups, and items; the relationship effect is defined as the actor–partner interaction. By removing variance due to persons and groups, a measure of a “pure” relationship effect is obtained. In this article, generalizability theory (G Theory) is applied to estimate the reliability of SRM components from round‐robin data structures. Using G Theory, reliability formulas for actor, partner, group, and relationship are developed and interpretations for the reliability estimates are provided. The authors also discuss how these formulas can be used in both planning and interpreting results from relationship research.

14.
This study examines critical aspects of both the ecological and the person‐oriented accounts of observed biases in confidence judgements on tests of cognitive abilities. These biases reflect metacognitive processes involved in test‐taking. According to the ecological approach, poor realism of confidence judgements is due to the nature of the items included in general knowledge tests (test‐driven biases). The person‐oriented approach, however, argues that biases in confidence judgements may be due to a general self‐monitoring trait. The present study employed the ‘de‐biasing’ procedure proposed by Juslin (1994) for the selection of general knowledge test items, and used a newly developed geographical knowledge test suitable for the Australian population. Two other cognitive tests (Raven's Progressive Matrices and Line Length) were administered in order to determine whether there is a consistency in confidence ratings across diverse tasks. Statistical procedures traditional to both approaches (calibration curves and factor analysis) were employed. The results, with minor qualifications, support both perspectives. The study found a separate confidence factor, indicative of a self‐monitoring trait. Two other potential metacognitive factors (i.e. ‘expectation’ and ‘evaluation’, corresponding to self‐assessment/planning and self‐evaluation) could not be separated from accuracy and speed measures. Copyright © 2001 John Wiley & Sons, Ltd.

15.
The current study developed and tested a multiple‐stimulus‐without‐replacement (MSWO) assessment for potential sexual partners for use in research on human immunodeficiency virus. College students (N = 41) first completed an MSWO assessment and then completed a hypothetical purchase task for encounters with partners identified by the MSWO as high, median, and low preference. Overall, hypothetical purchase task responding was consistent with that from the MSWO, in that the highest valuation was observed for the high‐preference partner and the lowest for the low‐preference partner. Potentially interesting individual differences in purchase task responding, however, were obtained; some subjects showed differentiated responding among the 3 preference levels (n = 15), whereas others similarly valued high‐ and median‐preference partners (n = 5), and others similarly valued low‐ and median‐preference partners (n = 18).

16.
Despite its status as a prominent set of theories for explaining the elicitation and differentiation of emotions, much appraisal theory and research offers little indication of the nature of the relationship expected between appraisals and emotions. Here, we present a three‐study, multiple‐method analysis in which we examine numerous ways of testing appraisal–emotion relationships using the “prosocial” intergroup emotions (sympathy, anger, and guilt) as an example. Results show that the set of appraisal dimensions that appears strongly characteristic of an emotion varies depending on the kind of appraisal–emotion relationship hypothesised and the experimental methodology/statistical analysis used. These findings demonstrate the utility of explicit theorising about the nature of the relationship between emotions and appraisals, and show how the hypothesised appraisal–emotion relationship and choice of methodology can affect the structure of appraisal theories. We recommend an analysis across multiple methods to provide a more complete picture of a given set of appraisal–emotion relationships.

17.
This study compared 2 methods of fading prompts while teaching tacts to 3 individuals who had been diagnosed with autism spectrum disorder (ASD). The 1st method involved use of an echoic prompt and prompt fading. The 2nd method involved providing multiple‐alternative answers and fading by increasing the difficulty of the discrimination. An adapted alternating‐treatments design showed that both procedures were more effective than a no‐intervention control condition. Providing multiple alternatives did not increase error rates or teaching time, and better maintenance was shown for tacts taught with the multiple‐alternative prompt.

18.
We report three studies showing that in prospective multiple‐trial decisions people often select a mix of sure and risky options over pure bundles of either option. Such a preference is not ‘rational’ because a mixed option cannot be the EV‐maximizing choice. Experiment 1 confirmed a mixed‐option preference for gains but not for losses. Showing a graph of the multiple‐trial outcome distribution reduced but did not eliminate this effect, suggesting that it is not due purely to a failure to aggregate correctly over the multiple trials. Experiment 2 replicated the mixed option preference using a wider range of problems. Experiment 3 compared choices in the trinary choice conditions used in Experiments 1 and 2 with binary choices between pairs of the multiple‐trial sure, mixed, and risky options. In the binary choice condition the mixed option was no longer the modal choice, suggesting that the strong mixed option preference found in the trinary choice conditions is mainly due to a compromise effect. However, the binary choice probabilities did show violations of strong stochastic transitivity in a pattern that suggested a slight bias toward the mixed option. Copyright © 2006 John Wiley & Sons, Ltd.
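The claim that a mixed option cannot be the EV‐maximizing choice follows from linearity of expectation: the expected value of a bundle is linear in the number of risky plays, so it is largest at one of the two pure bundles unless the options tie. A minimal sketch, assuming a risky option that pays a fixed amount with some probability and nothing otherwise; all payoffs and probabilities are hypothetical and are not taken from the experiments.

```python
def bundle_ev(n_trials, n_risky, sure_amount, risky_amount, risky_prob):
    """Expected value of playing the risky option n_risky times
    and taking the sure amount on the remaining trials."""
    ev_risky = risky_prob * risky_amount  # risky option pays 0 otherwise
    return n_risky * ev_risky + (n_trials - n_risky) * sure_amount


# Hypothetical problem: 10 trials, sure 4.0 versus a 50% chance of 10.0.
n = 10
evs = [
    bundle_ev(n, k, sure_amount=4.0, risky_amount=10.0, risky_prob=0.5)
    for k in range(n + 1)
]
best_k = max(range(n + 1), key=evs.__getitem__)

for k, ev in enumerate(evs):
    print(f"{k:2d} risky plays: EV = {ev:.1f}")
print(f"EV-maximizing number of risky plays: {best_k}")
```

Every mixed bundle (0 < k < n) lies strictly between the two pure bundles whenever the per-trial expected values differ, so a preference for the mix has to come from something other than expected value, such as variance reduction or the compromise effect the authors discuss.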

19.
Almost all previous studies examining the benefits of testing for promoting student learning have used fixed schedules of practice. However, students more often report utilizing a dropout schedule of practice, in which items are dropped from practice once they are known. Two experiments investigated the costs and benefits of utilizing a dropout schedule of test–restudy practice. Participants learned Swahili–English paired associates using a dropout schedule or a fixed schedule. In the dropout schedule, items received test–restudy practice until each item was correctly recalled once. In the fixed schedule, all items received three test–restudy practice trials regardless of whether they were correctly recalled, as in previous research. Experiment 2 also included a second learning session. In both experiments, a final cued recall test was administered several days later. Results indicated that the benefits of the dropout schedule (fewer practice trials used overall and all items correctly recalled once during practice) need to be considered in light of the costs (lower levels of final test performance). Copyright © 2009 John Wiley & Sons, Ltd.

20.
Inference methods for null hypotheses formulated in terms of distribution functions in general non‐parametric factorial designs are studied. The methods can be applied to continuous, ordinal or even ordered categorical data in a unified way, and are based only on ranks. In this set‐up Wald‐type statistics and ANOVA‐type statistics are the current state of the art. The first method is asymptotically exact but a rather liberal statistical testing procedure for small to moderate sample sizes, while the latter is only an approximation which does not possess the correct asymptotic α level under the null. To bridge these gaps, a novel permutation approach is proposed which can be seen as a flexible generalization of the Kruskal–Wallis test to all kinds of factorial designs with independent observations. It is proven that the permutation principle is asymptotically correct while keeping its finite exactness property when data are exchangeable. The results of extensive simulation studies support these theoretical findings. A real data set exemplifies its applicability.
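For orientation, the classical special case that the abstract says is being generalized can be sketched as a permutation version of the Kruskal–Wallis test in a one-way layout. This is not the authors' procedure for general factorial designs: it simply permutes group labels and recomputes the rank-based H statistic, which gives an exact-in-principle test when the data are exchangeable under the null. The function names and the example data are hypothetical.

```python
import random


def midranks(values):
    """Return 1-based ranks, averaging the ranks of tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i..j, shifted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)


def permutation_p_value(groups, n_perm=2000, seed=0):
    """Monte Carlo permutation p-value for H under exchangeability."""
    rng = random.Random(seed)
    observed = kruskal_wallis_h(groups)
    pooled = [x for g in groups for x in g]
    sizes = [len(g) for g in groups]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if kruskal_wallis_h(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical, fully separated groups.
example = [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]
print("H =", kruskal_wallis_h(example))
print("permutation p-value =", permutation_p_value(example))
```

Permuting the pooled observations is what provides finite-sample exactness under exchangeability; the abstract's contribution is showing that a permutation principle also remains asymptotically valid in more general factorial designs, which this sketch does not attempt.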
