Similar Articles
Found 20 similar articles (search time: 31 ms)
1.
Tests can be used either diagnostically (i.e., to confirm or rule out the presence of a condition in people suspected of having it) or as a screening instrument (determining who in a large group of people has the condition, often when those people are unaware of it or unwilling to admit to it). Tests that may be useful and accurate for diagnosis may actually do more harm than good when used as a screening instrument. The reason is that the proportion of false negatives may be high when the prevalence is high, and the proportion of false positives tends to be high when the prevalence of the condition is low (the usual situation with screening tests). My first aim in this article is to discuss the effects of the base rate, or prevalence, of a disorder on the accuracy of test results. My second aim is to review some of the many diagnostic efficiency statistics that can be derived from a 2 × 2 table, including the overall correct classification rate, kappa, phi, the odds ratio, positive and negative predictive power and some variants of them, and likelihood ratios. In the last part of this article, I review the recent Standards for Reporting of Diagnostic Accuracy guidelines (Bossuyt et al., 2003) for reporting the results of diagnostic tests and extend them to cover the types of tests used by psychologists.
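
The base-rate effect described in this abstract can be illustrated with a minimal sketch. The function name and the 90% sensitivity and specificity figures below are illustrative assumptions, not values taken from the article; the formulas are the standard definitions of positive and negative predictive value.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence.

    All three inputs are proportions between 0 and 1. The four cell
    proportions of the 2 x 2 table are computed first.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)

    ppv = true_pos / (true_pos + false_pos)   # P(condition | positive test)
    npv = true_neg / (true_neg + false_neg)   # P(no condition | negative test)
    return ppv, npv


# Hypothetical test with 90% sensitivity and 90% specificity,
# applied at screening-like and diagnosis-like prevalences.
for prevalence in (0.01, 0.10, 0.50):
    ppv, npv = predictive_values(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With the same test, the positive predictive value drops sharply as prevalence falls, which is the pattern the abstract attributes to screening use.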

2.
False consensus refers to an egocentric bias that occurs when people estimate consensus for their own behaviors. Specifically, the false consensus hypothesis holds that people who engage in a given behavior will estimate that behavior to be more common than it is estimated to be by people who engage in alternative behaviors. A meta-analysis was conducted upon 115 tests of this hypothesis. The combined effects of the tests of the false consensus hypothesis were highly statistically significant and of moderate magnitude. Further, the 115 tests of false consensus appear to be relatively heterogeneous in terms of significance levels and effect sizes. Correlational analyses and focused comparisons indicate that the false consensus effect does not appear to be influenced by the generality of the reference population, nor by the difference between alternative choices in actual consensus. However, the significance and magnitude of the false consensus effect were significantly predicted by the number of behavioral choices/estimates subjects had to make, and the sequence of measurement of choices and estimates. These patterns of results are interpreted as being inconsistent with the self-presentational, motivational explanation for the false consensus effect.

3.
One of the earliest instruments to screen for problem gambling, the Twenty Questions (20Q), was developed within Gamblers Anonymous. This instrument has not received serious research attention, however, and its psychometric properties are generally unknown. This study reports reliability and validity data for this instrument in 3 independent samples totaling 456 participants: two samples of problem gamblers in treatment and a non-treatment sample of problem gamblers. The Twenty Questions was shown to possess high reliability as measured by Cronbach’s alpha. Concurrent, convergent and predictive validity of the 20Q supported the use of this instrument as an acceptable screening instrument. Classification analyses indicated that the 20Q is comparable to the DSM-IV diagnostic criteria for pathological gambling in specificity, sensitivity and rates of false negatives and false positives. The 20Q appears to be a reliable and valid measure of problem gambling and warrants continued research attention.

4.
Many critical search tasks, such as airport and medical screening, involve searching for targets that are rarely present. These low-prevalence targets are associated with extremely high miss rates (Wolfe, Horowitz, & Kenner, Nature, 435, 439–440, 2005). The inflated miss rates are caused by a criterion shift, likely due to observers attempting to equate the numbers of misses and false alarms. This equalizing strategy results in a neutral criterion at 50% target prevalence, but leads to a higher proportion of misses for low-prevalence targets. In the present study, we manipulated participants' perceived number of misses through explicit false feedback. As predicted, the participants in the false-feedback condition committed a higher number of false alarms due to a shifted criterion. Importantly, the participants in this condition were also more successful in detecting targets. These results highlight the importance of perceived prevalence in target search tasks.
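
The equalizing strategy mentioned in this abstract can be sketched under an equal-variance Gaussian signal detection model. That model, the sensitivity value d' = 2, and every function name below are assumptions made for illustration; the abstract does not specify the model the authors used.

```python
from statistics import NormalDist

_PHI = NormalDist().cdf  # standard normal cumulative distribution function


def rates(d_prime, criterion):
    """Hit and false-alarm rates in the equal-variance Gaussian model.

    The criterion is measured from the midpoint between the noise and
    signal distributions, so 0 is the neutral (unbiased) criterion.
    """
    hit_rate = _PHI(d_prime / 2 - criterion)
    false_alarm_rate = _PHI(-d_prime / 2 - criterion)
    return hit_rate, false_alarm_rate


def equalizing_criterion(d_prime, prevalence, lo=-5.0, hi=5.0, iters=100):
    """Find, by bisection, the criterion at which the expected number of
    misses equals the expected number of false alarms."""
    def excess_misses(c):
        hit, fa = rates(d_prime, c)
        return prevalence * (1 - hit) - (1 - prevalence) * fa

    for _ in range(iters):
        mid = (lo + hi) / 2
        if excess_misses(mid) > 0:   # too many misses: criterion too strict
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


for prevalence in (0.50, 0.10, 0.02):
    c = equalizing_criterion(2.0, prevalence)
    hit, fa = rates(2.0, c)
    print(f"prevalence={prevalence:.2f}  criterion={c:.2f}  "
          f"miss rate={1 - hit:.2f}  false-alarm rate={fa:.3f}")
```

In this sketch the equalizing criterion is neutral at 50% prevalence and becomes more conservative as prevalence falls, so the proportion of targets missed rises.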

5.
Bernard Williams argues that human mortality is a good thing because living forever would necessarily be intolerably boring. His argument is often attacked for unfoundedly proposing asymmetrical requirements on the desirability of living for mortal and immortal lives. My first aim in this paper is to advance a new interpretation of Williams' argument that avoids these objections, drawing in part on some of his other writings to contextualize it. My second aim is to show how even the best version of his argument only supports a somewhat weaker thesis: it may be possible for some people with certain special psychological features to enjoy an immortal life, but no one has good reason to bet on being such a person.

6.
In a comparison of 2 treatments, if outcome scores are denoted by X in 1 condition and by Y in the other, stochastic equality is defined as P(X < Y) = P(X > Y). Tests of stochastic equality can be affected by characteristics of the distributions being compared, such as heterogeneity of variance. Thus, various robust tests of stochastic equality have been proposed and are evaluated here using a Monte Carlo study with sample sizes ranging from 10 to 30. Three robust tests are identified that perform well in Type I error rates and power except when extremely skewed data co-occur with very small n. When tests of stochastic equality might be preferred to tests of means is also considered.
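
The two probabilities in the definition of stochastic equality can be estimated descriptively from two samples by comparing every X with every Y. The sketch below is only that descriptive estimate; it is not one of the robust tests evaluated in the article, and the function name and toy scores are hypothetical.

```python
def stochastic_comparison(x, y):
    """Estimate P(X < Y) and P(X > Y) from all pairs (a, b), a in x, b in y.

    Ties count toward neither probability, so the two estimates need
    not sum to 1.
    """
    n_pairs = len(x) * len(y)
    less = sum(1 for a in x for b in y if a < b)
    greater = sum(1 for a in x for b in y if a > b)
    return less / n_pairs, greater / n_pairs


# Toy outcome scores, made up purely for illustration.
x_scores = [1, 2, 3]
y_scores = [2, 3, 4]
p_less, p_greater = stochastic_comparison(x_scores, y_scores)
print(f"P(X < Y) = {p_less:.3f}, P(X > Y) = {p_greater:.3f}")
```

Stochastic equality corresponds to the two population probabilities being equal; a formal test would also need a standard error for their difference, which this sketch does not provide.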

7.
Multiple-choice tests are used frequently in higher education without much consideration of the impact this form of assessment has on learning. Multiple-choice testing enhances retention of the material tested (the testing effect); however, unlike other tests, multiple-choice can also be detrimental because it exposes students to misinformation in the form of lures. The selection of lures can lead students to acquire false knowledge (Roediger & Marsh, 2005). The present research investigated whether feedback could be used to boost the positive effects and reduce the negative effects of multiple-choice testing. Subjects studied passages and then received a multiple-choice test with immediate feedback, delayed feedback, or no feedback. In comparison with the no-feedback condition, both immediate and delayed feedback increased the proportion of correct responses and reduced the proportion of intrusions (i.e., lure responses from the initial multiple-choice test) on a delayed cued recall test. Educators should provide feedback when using multiple-choice tests.

8.
The aim of this research was to determine to what extent a psychopathy screening device (the APSD) is useful in forensic assessments to predict general and violent offending. For this purpose, a cross-sectional study was conducted and 238 young people serving a sentence were assessed. The gold standard instrument used to measure psychopathy was the Psychopathy Checklist: Youth Version (PCL:YV; Forth, Kosson & Hare, 2003). The results indicate that the association found between the screening device scores and several indicators of risk is low compared with those obtained with the PCL:YV, suggesting that it is less useful as a tool for predicting offending or violent offences. However, an Area Under the Curve of .784 and a validity index of 62.5 support its use as a screening device or as a preliminary approach to assess psychopathy in this population. The usefulness of this instrument to make assessments with young people in the forensic setting is discussed.

9.
Target prevalence influences visual search behavior. At low target prevalence, miss rates are high and false alarms are low, while the opposite is true at high prevalence. Several models of search aim to describe search behavior, one of which has been specifically intended to model search at varying prevalence levels. The multiple decision model (Wolfe & Van Wert, Current Biology, 20(2), 121–124, 2010) posits that all searches that end before the observer detects a target result in a target-absent response. However, researchers have found very high false alarms in high-prevalence searches, suggesting that prevalence rates may be used as a source of information to make “educated guesses” after search termination. Here, we further examine the ability of prevalence level and knowledge gained during visual search to influence guessing rates. We manipulate target prevalence and the amount of information that an observer accumulates about a search display prior to making a response to test if these sources of evidence are used to inform target-present guess rates. We find that observers use both information about target prevalence rates and information about the proportion of the array inspected prior to making a response, allowing them to make an informed and statistically driven guess about the target’s presence.

10.
The aim of this research was to compare two different case-identification designs: (a) a one-stage anonymous design using the Eating Disorders Examination-Questionnaire (EDE-Q; Fairburn & Beglin, 1994) as diagnostic instrument and (b) a two-stage non-anonymous design using the Eating Attitudes Test (EAT; Garner & Garfinkel, 1979) and the EDE-Q as screening instruments and the clinical interview Eating Disorders Examination (EDE; Fairburn & Cooper, 1993) as diagnostic instrument, in the estimation of eating disorders prevalence in community samples. Both epidemiological designs were compared in: eating disorders prevalence, population at risk, and weekly frequency of associated symptomatology (binge eating episodes, self-induced vomiting) within a sample of 559 students (14- to 18-year-old males and females) studying in the region of Madrid. Eating disorders prevalence estimation using the single-stage design was 6.2%, and 3% using the two-stage design; however, these differences were not significant (p = .067). No significant differences between the two procedures were found either in population at risk or in weekly frequency of reported self-induced vomiting. Reported binge eating episodes were higher in the one-stage design. The use of a two-stage procedure with clinical interview (vs. questionnaire) leads to a better understanding of the items (especially the most ambiguous ones) and thus to a more accurate prevalence estimation.

11.
What Is Terrorism?   Total citations: 1, self-citations: 0, citations by others: 1
ABSTRACT My aim in this paper is not to try to formulate the meaning the word ‘terrorism’ has in ordinary use; the word is used in so many different, even incompatible ways, that such an enterprise would quickly prove futile. My aim is rather to try for a definition that captures the trait, or traits, of terrorism which cause most of us to view it with moral repugnance. I discuss the following questions: Is the historical connection of terrorism with terror to be preserved on the conceptual level, or relegated to the psychology and sociology of terrorism? Does mere infliction of terror qualify as terrorism, so that we can speak of non-violent terrorism? If terrorism is a type of violence, does it have to be against persons, or should violence against property also count? In what sense can terrorism be described as indiscriminate violence? Should we use the word only in a political context? In such a context, can we speak of ‘state terrorism’, or should the word be restricted to actions not sanctioned by law? Is the terrorist necessarily oblivious to moral considerations, as those who define terrorism in terms of antinomianism imply? My answers to these questions lead up to the following definition: terrorism is the deliberate use of violence, or threat of its use, against innocent people, with the aim of intimidating them, or other people, into a course of action they otherwise would not take.

12.
Twelve psychological tests including a standardized questionnaire were administered to 20 male viscose rayon workers with long-term exposure to carbon disulfide and to 152 nonexposed men. With the method of multiple discriminant analysis the number of tests was reduced from 12 to 5 and the number of variables from 30 to 7. The variable setting of the obtained discriminant function contained measures of different types of psychomotor performances, emotional behaviour and subjective symptoms. Sensitivity and specificity of the tests and the criteria for a detected effect were evaluated a posteriori. In general, the sensitivity of the methods was better than their specificity. Sufficient specificity could be obtained when a higher probability level for belonging to the exposed group was applied as the criterion, but even then, the application of other, reference diagnostic methods seems necessary to separate the false positive cases.

13.
Automated diagnostic aids prone to false alarms often produce poorer human performance in signal detection tasks than equally reliable miss-prone aids. However, it is not yet clear whether this is attributable to differences in the perceptual salience of the automated aids' misses and false alarms or is the result of inherent differences in operators' cognitive responses to different forms of automation error. The present experiments therefore examined the effects of automation false alarms and misses on human performance under conditions in which the different forms of error were matched in their perceptual characteristics. Young adult participants performed a simulated baggage x-ray screening task while assisted by an automated diagnostic aid. Judgments from the aid were rendered as text messages presented at the onset of each trial, and every trial was followed by a second text message providing response feedback. Thus, misses and false alarms from the aid were matched for their perceptual salience. Experiment 1 found that even under these conditions, false alarms from the aid produced poorer human performance and engendered lower automation use than misses from the aid. Experiment 2, however, found that the asymmetry between misses and false alarms was reduced when the aid's false alarms were framed as neutral messages rather than explicit misjudgments. Results suggest that automation false alarms and misses differ in their inherent cognitive salience and imply that changes in diagnosis framing may allow designers to encourage better use of imperfectly reliable automated aids.

14.
Even if people acknowledge that misinformation is incorrect after a correction has been presented, their feelings towards the source of the misinformation can remain unchanged. The current study investigated whether participants reduce their support of Republican and Democratic politicians when the prevalence of misinformation disseminated by the politicians appears to be high in comparison to the prevalence of their factual statements. We presented U.S. participants either with (1) equal numbers of false and factual statements from political candidates or (2) disproportionately more false than factual statements. Participants received fact-checks as to whether items were true or false, then rerated both their belief in the statements and their feelings towards the candidate. Results indicated that when corrected misinformation was presented alongside equal presentations of affirmed factual statements, participants reduced their belief in the misinformation but did not reduce their feelings towards the politician. However, if there was considerably more misinformation retracted than factual statements affirmed, feelings towards both Republican and Democratic figures were reduced, although the observed effect size was extremely small.

15.
There is a high prevalence of personality disorder in most prison populations. Many pass through the system undiagnosed. A screening instrument would improve identification. This study examined the screening properties of the Personality Diagnostic Questionnaire-4+ (PDQ-4+) in prisoners convicted of violent and sexual offenses. A sample of British prisoners completed the self-report PDQ-4+ and were interviewed using the Structured Clinical Interview for DSM-IV Axis II disorders. When used to generate a total score, the PDQ-4+ had an acceptable overall accuracy as measured by the area under the Receiver Operating Characteristic (ROC) curve. The PDQ-4+ appears to have the properties suitable for use as a screening instrument, particularly when screening for the presence or absence of personality disorder rather than for individual personality disorder categories. A graph is presented from which choices of cut-off score for different combinations of sensitivity and specificity can be made. A cut-off total score of 25 or above yielded near optimal sensitivity and specificity. The suggested cut-off score for this population is lower than that previously suggested.
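
The trade-off between sensitivity and specificity across cut-off scores, which the abstract says is presented as a graph, can be sketched as follows. The scores and diagnoses are made-up toy values rather than data from the study, and the function name is hypothetical; a score at or above the cut-off is treated as a positive screen, matching the "25 or above" convention in the abstract.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Sensitivity and specificity when scores >= cutoff screen positive."""
    pairs = list(zip(scores, has_condition))
    true_pos = sum(1 for s, d in pairs if d and s >= cutoff)
    false_neg = sum(1 for s, d in pairs if d and s < cutoff)
    true_neg = sum(1 for s, d in pairs if not d and s < cutoff)
    false_pos = sum(1 for s, d in pairs if not d and s >= cutoff)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Toy questionnaire totals: five people without and five with the condition.
toy_scores = [10, 15, 20, 24, 28, 22, 26, 30, 35, 40]
toy_diagnosis = [False] * 5 + [True] * 5

for cutoff in (21, 25, 29):
    sens, spec = sensitivity_specificity(toy_scores, toy_diagnosis, cutoff)
    print(f"cutoff={cutoff}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

Lowering the cut-off raises sensitivity at the cost of specificity, which is why the choice of cut-off depends on the population and the purpose of screening.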

16.
There have been many discussions of how Type I errors should be controlled when many hypotheses are tested (e.g., all possible comparisons of means, correlations, proportions, the coefficients in hierarchical models, etc.). By and large, researchers have adopted familywise (FWER) control, though this practice certainly is not universal. Familywise control is intended to deal with the multiplicity issue of computing many tests of significance, yet such control is conservative (that is, less powerful) compared to per test/hypothesis control. The purpose of our article is to introduce the readership, particularly those readers familiar with issues related to controlling Type I errors when many tests of significance are computed, to newer methods that provide protection from the effects of multiple testing, yet are more powerful than familywise controlling methods. Specifically, we introduce a number of procedures that control the k-FWER. These methods (say, 2-FWER instead of 1-FWER, i.e., FWER) are equivalent to specifying that the probability of 2 or more false rejections is controlled at .05, whereas FWER controls the probability of any (i.e., 1 or more) false rejections at .05. 2-FWER implicitly tolerates 1 false rejection and makes no explicit attempt to control the probability of its occurrence, unlike FWER, which tolerates no false rejections at all. More generally, k-FWER tolerates k - 1 false rejections, but controls the probability of k or more false rejections at α = .05. We demonstrate with two published data sets how more hypotheses can be rejected with k-FWER methods compared to FWER control.
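
One simple way to see why k-FWER control can reject more hypotheses is the single-step generalized Bonferroni rule, which rejects a hypothesis when its p-value is at most kα/m for m tests. Whether this particular rule is among the procedures the article introduces is an assumption; the function name and the p-values below are hypothetical.

```python
def k_fwer_bonferroni(p_values, k=1, alpha=0.05):
    """Single-step generalized Bonferroni rule for k-FWER control.

    Rejects hypothesis i when p_i <= k * alpha / m, where m is the
    number of tests. With k = 1 this is the ordinary Bonferroni rule.
    Returns a list of booleans (True means rejected).
    """
    m = len(p_values)
    threshold = min(1.0, k * alpha / m)
    return [p <= threshold for p in p_values]


# Made-up p-values, for illustration only.
p_values = [0.001, 0.008, 0.015, 0.04, 0.2]
print("FWER   (k=1):", k_fwer_bonferroni(p_values, k=1))
print("2-FWER (k=2):", k_fwer_bonferroni(p_values, k=2))
```

Raising k loosens the per-test threshold, so the set of rejections under 2-FWER contains the set under ordinary FWER control; the price is that one false rejection is tolerated.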

17.
Tsai WC, Soong WT, Shyu YI. Autism, 2012, 16(4): 340-349
No feasible screening instrument is available for early detection of children with autism in Taiwan. The existing instruments may not be appropriate for use in Taiwan due to different health care systems and child-rearing cultures. The purpose of this study was to develop and test a screening questionnaire for generic autism. The initial 18-item screening questionnaire was developed by a child psychiatrist using face-to-face interviews with 10 families of children with autism and then tested on a sample of families of 18 children with autism and of 59 typically developing children. Of these 18 items, 15 had fair or better item discrimination (kappa >0.20) and were selected for the revised screening questionnaire. In the revised questionnaire, cutoff scores of 5 and 6 offered 100% sensitivity and 96.5% specificity, with the area under the receiver operating characteristic curve of 0.983. The revised screening instrument has high sensitivity and specificity, making it potentially useful for screening Taiwanese children at risk for autism. This instrument should be further tested in a population-based study.

18.
When people identify an error in their initial judgment, they typically try to correct it. But, in some cases, they choose not to, even when they know, in the moment, that they are being irrational or making a mistake. A baseball fan may know that he cannot affect the pitcher from his living room but still be reluctant to say “no-hitter.” A person may learn that flying in an airplane is statistically safer than driving a car and still refuse to fly. Dual-process models of judgment and decision making often implicitly assume that if an error is detected, it will be corrected. Recent work suggests, however, that models should decouple error detection and correction. Indeed, people can explicitly recognize that their intuitive judgment is wrong but, nevertheless, stick with it, a phenomenon known as acquiescence. My goals are to offer criteria for identifying acquiescence, consider why people acquiesce even when it incurs a cost, discuss how lessons that are learned in cases when acquiescence is clearly identified can be exported to cases when acquiescence may be harder to establish, and, more broadly, describe the implications of a model that decouples error detection and error correction.

19.
Three- and 4-year-old children were tested using videos of puppets in various versions of a theory of mind change-of-location situation, in order to answer several questions about what children are doing when they pass false belief tests. To investigate whether children were guessing or confidently choosing their answer to the test question, a condition in which children were forced to guess was also included, and measures of uncertainty were compared across conditions. To investigate whether children were using simpler strategies than an understanding of false belief to pass the test, we teased apart the seeing-knowing confound in the traditional change-of-location task. We also investigated relations between children's performance on true and false belief tests. Results indicated that children appeared to be deliberately choosing, not guessing, in the false belief tasks. Children performed just as well whether the protagonist gained information about the object visually or verbally, indicating that children were not using a simple rule based on seeing to predict the protagonist's behaviour. A true belief condition was significantly easier for children than a false belief condition as long as it was of low processing demands. Children's success rate on the different versions of the standard false belief task was influenced by factors such as processing demands of the stories and the child's verbal abilities.

20.
In reply to certain cosmological arguments for theism, critics regularly argue that the causal principle ex nihilo nihil fit may be false. Various theistic counter-replies to this challenge have emerged. One type of strategy is to double down on ex nihilo nihil fit. Another, very different strategy of counter-reply is to grant for the sake of argument that the principle is false, while maintaining that sound cosmological arguments can be formulated even with this concession in place. Notably, one can employ a weaker opening premise formulated in modal terms, proceeding for instance from the proposition that for any contingent object coming into existence it is at least possible that it (or a duplicate) have a cause. My aim here is to try out a related strategy for weakening the relevant opening premise. Granting that it is possible for a contingent object to come into existence out of nothing without a cause, I proceed from the extremely modest claim that the obtaining of exceptionless (or nearly exceptionless) longstanding contingent regularities demands an explanation. As such, the contingent regularity that empirically accessible macro-level contingent objects do not pop into existence causelessly demands explanation. And as it turns out, that explanation will have to be in terms of an object or objects possessed of at least some of the traditional divine attributes.
