Similar literature
1.
The Type I error probability and the power of the independent samples t test, performed directly on the ranks of scores in combined samples in place of the original scores, are known to be the same as those of the non‐parametric Wilcoxon–Mann–Whitney (WMW) test. In the present study, simulations revealed that these probabilities remain essentially unchanged when the number of ranks is reduced by assigning the same rank to multiple ordered scores. For example, if 200 ranks are reduced to as few as 20, or 10, or 5 ranks by replacing sequences of consecutive ranks by a single number, the Type I error probability and power stay about the same. Significance tests performed on these modular ranks consistently reproduce familiar findings about the comparative power of the t test and the WMW test for normal and various non‐normal distributions. Similar results are obtained for modular ranks used in comparing the one‐sample t test and the Wilcoxon signed ranks test.
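The rank-reduction step described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the equal-width binning rule in `modular_ranks`, the helper names, and the simulated data are hypothetical and are not taken from the study.

```python
import math
import random


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def modular_ranks(ranks, n_levels):
    """Collapse ranks 1..n into n_levels blocks of consecutive ranks.

    Assumption: equal-width blocks; the study may have used another rule.
    """
    n = len(ranks)
    return [min(n_levels, int((r - 1) * n_levels / n) + 1) for r in ranks]


def pooled_t(x, y):
    """Student's two-sample t statistic with pooled variance."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ss = sum((v - mx) ** 2 for v in x) + sum((v - my) ** 2 for v in y)
    sp2 = ss / (nx + ny - 2)
    return (mx - my) / math.sqrt(sp2 * (1 / nx + 1 / ny))


if __name__ == "__main__":
    rng = random.Random(1)
    a = [rng.gauss(0.0, 1.0) for _ in range(100)]
    b = [rng.gauss(0.5, 1.0) for _ in range(100)]  # hypothetical shift
    full = average_ranks(a + b)                    # 200 ranks
    reduced = modular_ranks(full, 20)              # 20 modular ranks
    print("t on full ranks:   ", pooled_t(full[:100], full[100:]))
    print("t on modular ranks:", pooled_t(reduced[:100], reduced[100:]))
```

Running the two t statistics side by side over many simulated samples is one way to check, informally, whether the reduced ranks lead to the same decisions as the full ranks.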

2.
The purpose of this study was to evaluate a modified test of equivalence for conducting normative comparisons when distribution shapes are non‐normal and variances are unequal. A Monte Carlo study was used to compare the empirical Type I error rates and power of the proposed Schuirmann–Yuen test of equivalence, which utilizes trimmed means, with those of the previously recommended Schuirmann and Schuirmann–Welch tests of equivalence when the assumptions of normality and variance homogeneity are satisfied, as well as when they are not satisfied. The empirical Type I error rates of the Schuirmann–Yuen were much closer to the nominal α level than those of the Schuirmann or Schuirmann–Welch tests, and the power of the Schuirmann–Yuen was substantially greater than that of the Schuirmann or Schuirmann–Welch tests when distributions were skewed or outliers were present. The Schuirmann–Yuen test is recommended for assessing clinical significance with normative comparisons.

3.
The extent to which rank transformations result in the same statistical decisions as their non‐parametric counterparts is investigated. Simulations are presented using the Wilcoxon–Mann–Whitney test, the Wilcoxon signed‐rank test and the Kruskal–Wallis test, together with the rank transformations and t and F tests corresponding to each of those non‐parametric methods. In addition to Type I errors and power over all simulations, the study examines the consistency of the outcomes of the two methods on each individual sample. The results show how acceptance or rejection of the null hypothesis and differences in p‐values of the test statistics depend in a regular and predictable way on sample size, significance level, and differences between means, for normal and various non‐normal distributions.

4.
In Experiment 1, four developmentally delayed adolescents were taught an A-B matching-to-sample task with nonidentical stimuli: given Sample A1, select Comparison B1; given A2, select B2. During nonreinforced test trials, appropriate matching occurred when B stimuli appeared as samples and A stimuli as comparisons, i.e., the sample and comparison functions were symmetrical (B-A matching). During A-B or B-A matching test trials in which familiar samples and correct comparisons were presented along with novel comparisons, the subjects selected the correct comparisons. In tests with familiar samples and both incorrect and novel comparisons, subjects selected the novel comparisons, demonstrating control by both positive ("matching") and negative ("nonmatching") stimulus relations in A-B and B-A arrays. In Experiment 2, 12 developmentally delayed subjects were taught a two-stage arbitrary-matching task (e.g., A-B, C-B matching). Test sessions showed sample-comparison symmetry (e.g., B-A, B-C matching) and derived sample-comparison relations (e.g., A-C, C-A matching) for 11 subjects. These subjects also demonstrated control by positive and negative stimulus relations in the derived relations.

5.
After children in Experiments 1 and 2 learned identity matching or oddity, control by sample-comparison relations was assessed. Tests for generalized control displayed novel samples and two comparison stimuli, one identical to the sample. Specific relations were tested with identical or nonidentical sample-comparison stimuli from one set of stimuli and substitute comparisons from either the other training set or from a novel set. When tests displayed identical stimuli, patterns of comparison selection suggested control by generalized identity and oddity. However, selection patterns varied when stimuli were nonidentical and familiar or novel substitute comparisons were used. Therefore, control by specific relations is not a precondition for generalized identity and oddity. One set of training stimuli was used in Experiment 3, and generalized performances occurred again. Moreover, control by specific relations was shown by the oddity subjects and 2 of 6 identity subjects. Generalized and specific control may therefore exist simultaneously. In Experiment 4, selections were irregular on tests displaying substitute comparisons and samples and familiar comparison stimuli; this finding supported the relational account of specific sample-comparison control found in Experiment 3.

6.
A large class of rank tests, which includes the familiar sign test and the Wilcoxon signed-ranks test, is described and discussed. This class of distribution-free tests provides a flexible basis for testing research hypotheses of various forms. Exact small sample and approximate large sample procedures are considered. Applications of these procedures are presented, including simple numerical examples. The authors wish to acknowledge the constructive comments by the reviewers.

7.
Tests of the null hypothesis for comparisons involving sample means use the t test when the conditions of the z test cannot be met. The two tests have different rationales and can lead to different conclusions regarding significance. In the present study, the authors compared the properties of t and z in simulation runs. The differences in the results arise from fluctuations in the t test sample variances that do not exist in the z test, and those differences lead to differences in designating the significance of comparisons.
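A small simulation along these lines might look like the sketch below. It is an assumption-laden illustration, not the authors' design: it uses a one-sample setting with n = 5, a known population standard deviation of 1 for the z test, and the standard two-sided 5% critical values (1.960 for z, 2.776 for t with 4 degrees of freedom). All function names are hypothetical.

```python
import math
import random
import statistics

N = 5                 # sample size (assumed); fixes the t test at 4 df
Z_CRIT = 1.960        # two-sided 5% critical value, standard normal
T_CRIT_DF4 = 2.776    # two-sided 5% critical value, Student's t, 4 df


def z_stat(sample, mu0, sigma):
    """z statistic: uses the known population standard deviation."""
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(len(sample)))


def t_stat(sample, mu0):
    """t statistic: uses the sample standard deviation, which fluctuates."""
    s = statistics.stdev(sample)
    return (statistics.mean(sample) - mu0) / (s / math.sqrt(len(sample)))


def disagreement_rate(n_runs=10000, seed=0):
    """Share of null-hypothesis samples on which z and t decisions differ."""
    rng = random.Random(seed)
    disagreements = 0
    for _ in range(n_runs):
        sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
        z_rejects = abs(z_stat(sample, 0.0, 1.0)) > Z_CRIT
        t_rejects = abs(t_stat(sample, 0.0)) > T_CRIT_DF4
        disagreements += z_rejects != t_rejects
    return disagreements / n_runs


if __name__ == "__main__":
    print("share of samples with differing decisions:", disagreement_rate())
```

Because both statistics share the same numerator, any disagreement in this sketch comes from the sample standard deviation in the denominator of t and from the wider critical value that compensates for it.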

8.
Three experiments used postclass formation within-class preference test performances to evaluate the effects of nodal distance on the relatedness of stimuli in equivalence classes. In Experiment 1, two 2-node four-member equivalence classes were established using the simultaneous protocol in which all of the baseline relations were trained together, after which all emergent relations probes were presented together. All training and testing was done using match-to-sample trials that contained two comparisons. After class formation, the effects of nodal distance were evaluated using within-class preference tests that contained samples and both comparisons from the same class. These tests yielded inconsistent performances for most participants. Experiment 2 replicated Experiment 1, but a third null comparison was used on all trials during class formation. Thereafter, virtually all of the within-class probes, for all participants, evoked performances that were consistent with the predicted effects of nodal distance, that is, the selection of comparisons that were nodally closer to the samples. It appears, then, that the establishment of the equivalence classes with a third null comparison induced control by nodal structure of the classes. Experiment 3 demonstrated the generality of these findings with larger classes that contained more nodal separations, that is, three-node five-member classes. Emergent-relations tests conducted immediately after the within-class tests showed the classes to be intact. Thus, the differential relatedness of stimuli in a class or their interchangeability depended on the content of a test trial: within-class probes occasioned responding indicative of differential strength among the stimuli in the class, while cross-class tests occasioned responding indicative of interchangeability of stimuli in the same class.

9.
Six experienced orienteers were subjected to a VO2max treadmill test two days prior to undertaking two tests of visual perception. One test was conducted while the subjects were in a rested state, and the other was conducted while they were in a state of fatigue. Fatigue was defined as a state in which the subjects were working at or above their anaerobic threshold, which had been determined previously from their VO2max test. The tests in both the fatigue and rest conditions were of a similar nature, that is, the subjects were presented slides of orienteering checkpoints at regular intervals followed by a slide showing a set of questions which the subjects had to answer verbally. Two sets of slides were employed and these were approximately counterbalanced between both subjects and conditions. Points were awarded for the correct answers and the two conditions were then compared. The Wilcoxon test for two correlated samples was used and showed a significant difference between the fatigue and rest scores at p < 0.05. The data suggest that under the influence of fatigue, an orienteer's ability to perceive visual information is greatly impaired.

10.
Although assessment use is a professional activity recognized by every major counseling organization, little is known about which assessments are used in counseling. In this study, 926 respondents from a random national sample of counselors reported their use of personality, projective, career, intelligence/cognitive, educational/achievement, clinical/behavioral, and environmental/interpersonal tests. Test rankings by frequency of use and comparisons by type of counselor and type of test are reported. Implications for policy and practice are discussed.

11.
HONESTY TESTING FOR PERSONNEL SELECTION: A REVIEW AND CRITIQUE
Paper and pencil predictors of employee theft are described and studies of validity, reliability, and adverse impact of these tests are examined. Validity studies for 10 tests were grouped into 5 categories: comparisons with polygraph examination results, correlations with admissions of past theft, predictive studies using future job behaviors as criteria, comparisons of shrinkage rates before and after the introduction of a testing program, and comparisons of test scores of groups known to be dishonest with groups representing the general population. While positive correlations were consistently found, a variety of methodological differences between studies were identified which make the direct comparison of test validities suspect. High reliabilities are consistently reported, and test score comparisons by race and sex generally report no differences. Ethical issues in honesty test usage are considered and future research needs are identified.

12.
Two experiments were conducted using match-to-sample methodologies in an effort to model lexical classes, which include both arbitrary and perceptual relations between class members. Training in both experiments used a one-to-many mapping procedure with nonsense syllables as samples and eight sets of abstract stimuli as comparisons. These abstract stimuli differed along a number of dimensions, four of which were critical to the experimenter-defined class membership. Stimuli in some comparison sets included only one of the class-defining features, but stimuli in other sets included two, three, or all four of the critical features. After mastery of the baseline training, three types of probe tests were conducted: symmetry, transitivity/equivalence, and novel probe tests in which the training nonsense syllables served as samples, and comparisons were novel abstract stimuli that included one or more of the class-defining features. Symmetry and transitivity/equivalence probe tests showed that the stimuli used in training became members of equivalence classes. The novel stimuli also became class members on the basis of inclusion of any of the critical features. Thus these probe tests revealed the formation of open-ended generalized equivalence classes. In addition, typicality effects were observed such that comparison sets with more critical features were learned with fewer errors, responded to more rapidly, and judged to be better exemplars of the class. Contingency-shaped stimulus classes established through a match-to-sample procedure thus show several important behavioral similarities to natural lexical categories.

13.
Researchers (e.g., Ironson, 1982; Tenopyr, 1990) have suggested that item bias investigators equate subgroups on external criteria such as job performance rather than total test scores before considering subgroup passing rates on test items. In a study comparing these two approaches to studies of item bias, we found little evidence of bias using total test score as the estimate of overall examinee ability, but nearly all items were biased in comparisons of white and African-American subgroups on Numerical, Verbal, and Mechanical Reasoning tests and in male-female comparisons on a Mechanical Reasoning test when job performance was used to select "equally able" examinees. However, the use of job performance as the ability index is analogous to performance-based approaches to test bias (Hartigan & Wigdor, 1989; Thorndike, 1971) and directly equivalent to the Darlington (1971) and Cole (1973) test bias definition, the logical inconsistencies of which have been previously described (Hunter & Schmidt, 1976; Peterson & Novick, 1976). We conclude that performance matching as a basis of forming "equal ability" groups is inappropriate.

14.
In Experiment 1, 6 college students were given generalization tests using 25 line lengths as samples with a long line, a short line, and a “neither” option as comparisons. The neither option was to be used if a sample did not go with the other comparisons. Then, four-member equivalence classes were formed. Class 1 included three nonsense words and the short line. Class 2 included three other nonsense words and the long line. After repeating the generalization test for line length, additional tests were conducted using members of the equivalence classes (i.e., nonsense words and lines) as comparisons and intermediate-length lines as samples. All Class 2 comparisons were selected in the presence of the test lines that also evoked the selection of the long line in the generalization test that had been given before equivalence class formation. Class 1 yielded complementary findings. Thus, the preclass primary generalization gradient predicted which test lines acted as members of each equivalence class. Regardless of using comparisons that were nonsense words or lines, the post-class-formation gradients overlapped, showing the substitutability of class members. Experiment 2 assessed the discriminability of the intermediate-length test lines from the Class 1 (shortest) and Class 2 (longest) lines. The test lines that functioned as members of an equivalence class were discriminable from the line that was a member of the same class by training. Thus, these test lines also acted as members of a dimensionally defined class of “long” or “short” lines. Extension of an equivalence class, then, involved its merger with a dimensionally defined class, which converted a close-ended class to an open-ended class. These data suggest a means of predicting class membership in naturally occurring categories.

15.
Three adult subjects were taught the following two-sample, two-comparison conditional discriminations (each sample is shown with its positive and negative comparison, in that order): A1-B1B2, A2-B2B1; B1-C1C2, B2-C2C1; and C1-D1D2, C2-D2D1. A teaching procedure was designed to encourage control by negative comparisons. Subjects were then tested for emergent performances that would indicate whether the baseline conditional discriminations were reflexive, symmetric, and transitive. The tests documented the emergence of two classes of equivalent stimuli: A1, B2, C1, D2 and A2, B1, C2, D1. These were the classes to be expected if the negative comparisons were the controlling comparisons in the baseline conditional discriminations. The negative comparisons, however, were not the comparisons that subjects were recorded as having chosen in the baseline conditional discriminations. Differential test results confirmed predictions arising from a stimulus-control analysis: In reflexivity tests (AA, BB, CC, DD), subjects chose comparisons that differed from the sample; one-node transitivity (AC, BD) and "equivalence" (CA, DB) tests also yielded results that were the opposite of those to be expected from control by positive comparisons; symmetry tests (BA, CB, DC), two-node transitivity (AD) tests, and two-node "equivalence" (DA) tests yielded results that were to be expected from control by either positive or negative comparisons.
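One way to see why those two classes are the ones expected under negative-comparison control is to treat each baseline trial as linking the sample to its controlling comparison and then merge the links. The sketch below does this with a small union-find; the function name and the modelling choice (sample linked to the negative comparison under negative control) are illustrative assumptions, not the authors' procedure.

```python
# Baseline trials from the abstract: (sample, positive comparison, negative comparison)
BASELINE = [
    ("A1", "B1", "B2"), ("A2", "B2", "B1"),
    ("B1", "C1", "C2"), ("B2", "C2", "C1"),
    ("C1", "D1", "D2"), ("C2", "D2", "D1"),
]


def equivalence_classes(trials, control="negative"):
    """Group stimuli by linking each sample to its controlling comparison."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for sample, positive, negative in trials:
        find(positive)   # register both comparisons as stimuli
        find(negative)
        union(sample, negative if control == "negative" else positive)

    groups = {}
    for stimulus in list(parent):
        groups.setdefault(find(stimulus), set()).add(stimulus)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    print(equivalence_classes(BASELINE, control="negative"))
    print(equivalence_classes(BASELINE, control="positive"))
```

Under the negative-control assumption the merged groups are A1, B2, C1, D2 and A2, B1, C2, D1, matching the classes reported in the abstract; switching to positive control gives the classes with matching indices instead.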

16.
A detailed analysis is presented of the ways in which control by the negative stimulus in two-comparison conditional discriminations may be expected to affect the outcome of tests for the properties of equivalence relations. Control by the negative stimulus should produce the following results: (a) no observable effect on symmetry tests; (b) reflexivity test results should look like “oddity” rather than “identity”; and (c) transitivity tests that involve an odd number of nodes should yield results that are 100% opposite to tests that involve an even number of nodes. The analysis also considers the effects of variation in the type of comparison-stimulus control between and within baseline conditional discriminations. Methods are suggested for experimentally regulating the type of control, and for verifying the predictions that the analysis generates. If suggested experiments continue to support the analysis, investigators who use two-comparison conditional discriminations to study equivalence relations will either have to control explicitly whether the positive or the negative comparison governs their subjects' choices, or they will have to abandon two comparisons and use three or more comparisons instead.

17.
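Taking a life satisfaction scale as an example, confirmatory factor analysis was used to examine measurement invariance between web-based tests and traditional paper-and-pencil tests in the Chinese cultural context. The results showed weak invariance between the web-based and paper-and-pencil tests, that is, the two share the same unit of measurement. However, only partial strong invariance and partial strict invariance held between them, so the influence of the test administration environment on the results cannot be ignored. The study indicates that a properly designed web-based test is reliable, and it also suggests that when a test is used in different settings, it is necessary to test for measurement invariance.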

18.
A novel classification framework for clinical decision making that uses an Extremely Randomized Tree (ERT) based feature selection and a Diverse Intensified Strawberry Optimized Neural network (DISON) is proposed. DISON is a Feed Forward Artificial Neural Network where the optimization of weights and biases is done using a two-phase training strategy. Two algorithms, namely the Strawberry Plant Optimization (SPO) algorithm and the Gradient-descent Back-propagation algorithm, are used sequentially to identify the optimum weights and biases. The novel two-phase training method and the stochastic duplicate-elimination strategy of SPO help in addressing the issue of local optima associated with conventional neural networks. The relevant attributes are selected based on the feature importance values computed using an ERT classifier. Vertebral Column, Pima Indian diabetes (PID), Cleveland Heart disease (CHD) and Statlog Heart disease (SHD) datasets from the University of California Irvine machine learning repository are used for experimentation. The framework has achieved an accuracy of 87.17% for Vertebral Column, 90.92% for PID, 93.67% for CHD and 94.5% for SHD. The classifier performance has been compared with existing works and is found to be competitive in terms of accuracy, sensitivity and specificity. A Wilcoxon test confirms the statistical superiority of the proposed method.

19.
Two three-member classes were formed by training AB and BC using a conditional discrimination procedure. The A and B stimuli were nonsense syllables, and the C stimuli were sets of “short” or “long” lines. To test for equivalence, C1 or C2 was presented as a sample with A1 and A2 as comparisons. Once the class-related comparison was chosen consistently, different line lengths were substituted for the training lines in the CA tests. In general, the likelihood of choosing a given comparison was an inverse function of the difference in the length of the test line from the training line. Stimuli in an equivalence class became functionally related not only to each other but also to novel stimuli that resembled a member of the equivalence class. The combination of primary generalization and equivalence class formation, then, can serve as a model to account for the development of naturally occurring categories.

20.
A simple procedure for testing heterogeneity of variance is developed which generalizes readily to complex, multi-factor experimental designs. Monte Carlo studies indicate that the Z-variance test statistic presented here yields results equivalent to other familiar tests for heterogeneity of variance in simple one-way designs where comparisons are feasible. The primary advantage of the Z-variance test is in the analysis of factorial effects on sample variances in more complex designs. An example involving a three-way factorial design is presented.
