561.
Computer-based tests can record the time an examinee spends answering each item, known as the response time (RT). As an important source of auxiliary information, RT is valuable for test development and administration, particularly in computerized adaptive testing (CAT). This article briefly reviews and comments on applications of RT to item selection in CAT and analyzes the practical feasibility of these techniques. Finally, it discusses open problems in applying RT to CAT item selection and directions for further research.
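One widely cited way RT enters CAT item selection is the maximum-information-per-time-unit criterion, in which each candidate item's Fisher information at the current ability estimate is divided by its expected response time under a lognormal RT model. The sketch below illustrates that idea only; the item bank, the lognormal expectation, and all parameter values are hypothetical and are not taken from the article.

```python
import numpy as np

# Hypothetical item bank: 2PL discrimination (a), difficulty (b),
# and lognormal time-intensity (beta) / dispersion (sigma) parameters.
rng = np.random.default_rng(0)
a = rng.uniform(0.8, 2.0, size=50)
b = rng.normal(0.0, 1.0, size=50)
beta = rng.normal(4.0, 0.3, size=50)   # log-seconds time intensity
sigma = 0.4                            # common RT dispersion

def fisher_info(theta, a, b):
    """2PL Fisher information at ability theta."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a**2 * p * (1.0 - p)

def expected_time(beta, sigma):
    """Expected response time under a lognormal RT model."""
    return np.exp(beta + sigma**2 / 2.0)

def select_item(theta, administered):
    """Pick the unadministered item maximizing information per second."""
    eff = fisher_info(theta, a, b) / expected_time(beta, sigma)
    eff[list(administered)] = -np.inf
    return int(np.argmax(eff))

next_item = select_item(theta=0.3, administered={4, 17})
print("next item:", next_item)
```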
562.
In structural equation modeling applications, parcels—averages or sums of subsets of item scores—are often used as indicators of latent constructs. Parcel-allocation variability (PAV) is variability in results that arises within a sample across alternative item-to-parcel allocations. PAV can manifest in all results of a parcel-level model (e.g., model fit, parameter estimates, standard errors, and inferential decisions). It is a source of uncertainty in parcel-level model results that can be investigated, reported, and accounted for. Failing to do so raises representativeness and replicability concerns. However, in recent methodological literature (Cole, Perkins, & Zelkowitz, 2016; Little, Rhemtulla, Gibson, & Schoemann, 2013; Marsh, Lüdtke, Nagengast, Morin, & von Davier, 2013; Rhemtulla, 2016), parceling has been justified and recommended in several situations without quantifying or accounting for PAV. In this article, we explain and demonstrate problems with these rationales. Overall, we find that: (1) using a purposive parceling algorithm for a multidimensional construct does not avoid PAV; (2) passing a test of unidimensionality of the item-level model need not avoid PAV; and (3) a desire to improve power for detecting structural misspecification does not warrant parceling without addressing PAV; we show how to simultaneously avoid PAV and obtain even higher power by comparing item-level models differing in structural constraints. Implications for practice are discussed.
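PAV itself is straightforward to quantify: refit the same parcel-level analysis under many random item-to-parcel allocations and report the spread of the resulting estimates. The sketch below illustrates the mechanics with simulated items and a deliberately simplified parcel-level "result" (a principal-component loading); a real PAV study would refit the full SEM for each allocation, so treat the data and the summary statistic as assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated item-level data: 9 items, one latent factor (illustrative).
n, n_items = 500, 9
eta = rng.normal(size=n)
loadings = rng.uniform(0.4, 0.8, size=n_items)
items = eta[:, None] * loadings + rng.normal(scale=0.7, size=(n, n_items))

def random_allocation(n_items, n_parcels, rng):
    """Randomly assign items to parcels of (near-)equal size."""
    order = rng.permutation(n_items)
    return np.array_split(order, n_parcels)

def parcel_result(items, allocation):
    """Toy parcel-level 'result': first-principal-component loading of parcel 1.
    A real PAV study would refit the full parcel-level SEM instead."""
    parcels = np.column_stack([items[:, idx].mean(axis=1) for idx in allocation])
    corr = np.corrcoef(parcels, rowvar=False)
    vals, vecs = np.linalg.eigh(corr)
    pc1 = vecs[:, -1] * np.sqrt(vals[-1])   # standardized loadings on PC1
    return abs(pc1[0])

results = [parcel_result(items, random_allocation(n_items, 3, rng))
           for _ in range(200)]
print(f"parcel-1 loading across allocations: "
      f"min={min(results):.3f}, max={max(results):.3f}")
```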
563.
564.
Abstract

For adequate modeling of missing responses, a thorough understanding of the nonresponse mechanisms is vital. As a large number of major testing programs have moved, or are in the process of moving, to computer-based assessment, a rich body of additional data on examinee behavior becomes easily accessible. These additional data may contain valuable information on the processes associated with nonresponse. Bringing together research on item omissions with approaches for modeling response time data, we propose a framework for simultaneously modeling response behavior and omission behavior, utilizing timing information for both. The proposed model allows researchers (a) to gain a deeper understanding of response and nonresponse behavior in general and, in particular, of the processes underlying item omissions in large-scale assessments (LSAs); (b) to model the processes determining the time examinees require to generate a response or to omit an item; and (c) to account for nonignorable item omissions. Parameter recovery of the proposed model is studied in a simulation study, and the model is illustrated with an application to real data.
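Although the abstract does not give the model's exact parameterization, frameworks of this kind typically combine correlated person parameters for ability, omission propensity, and speed with item-level response, omission, and lognormal response-time components. The data-generating sketch below is only a loose illustration of such a structure; every distributional choice and parameter value is an assumption, not the authors' specification.

```python
import numpy as np

rng = np.random.default_rng(2)
n_persons, n_items = 1000, 20

# Correlated person parameters: ability (theta), omission propensity (xi),
# and general speed (tau). Values are illustrative assumptions.
cov = np.array([[1.0, -0.4, 0.2],
                [-0.4, 1.0, 0.0],
                [0.2, 0.0, 0.3]])
theta, xi, tau = rng.multivariate_normal([0, 0, 0], cov, size=n_persons).T

b = rng.normal(0, 1, n_items)             # item difficulty
d = rng.normal(-1.5, 0.5, n_items)        # omission "easiness" (higher -> more omits)
beta_resp = rng.normal(4.0, 0.3, n_items) # time intensity when responding
beta_omit = beta_resp - 1.0               # omissions assumed faster here

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

omit = rng.random((n_persons, n_items)) < sigmoid(xi[:, None] + d)
correct = rng.random((n_persons, n_items)) < sigmoid(theta[:, None] - b)
correct = np.where(omit, np.nan, correct)   # no response observed if omitted

log_rt = (np.where(omit, beta_omit, beta_resp) - tau[:, None]
          + rng.normal(scale=0.3, size=(n_persons, n_items)))
rt = np.exp(log_rt)

print("omission rate:", omit.mean())
print("mean RT responses:", rt[~omit].mean(), "s; omissions:", rt[omit].mean(), "s")
```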
565.
The study examined the relationship between examinees' test-taking effort and their accuracy rate on items from the PISA 2015 assessment. The 10% normative threshold method was applied to Science multiple-choice items in the Cyprus sample to detect rapid-guessing behavior. Results showed that the extent of rapid guessing across simple and complex multiple-choice items was on average less than 6% per item. Rapid guessers were identified, and for most items their accuracy was lower than the accuracy of students engaging in solution-based behavior. A number of plausible explanations were graphically evaluated for the items on which accuracy was higher for the rapid-guessing subgroup. Overall, this empirical investigation presents original evidence on test-taking effort as measured by response time in PISA items and tests propositions of Wise's (2017) Test-Taking Theory.
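The 10% normative threshold method flags a response as a rapid guess when its response time falls below 10% of the item's average response time, and then compares accuracy between flagged and non-flagged responses. A minimal sketch with simulated data (the contamination rate and RT distributions are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
n_persons, n_items = 2000, 15

# Simulated data: mostly solution behavior, plus a small share of
# rapid guesses with short RTs and near-chance accuracy (illustrative).
rt = rng.lognormal(mean=3.5, sigma=0.5, size=(n_persons, n_items))
correct = rng.random((n_persons, n_items)) < 0.6
guess_mask = rng.random((n_persons, n_items)) < 0.05
rt[guess_mask] = rng.uniform(0.5, 3.0, size=guess_mask.sum())
correct[guess_mask] = rng.random(guess_mask.sum()) < 0.25

# 10% normative threshold: 10% of each item's mean RT.
threshold = 0.10 * rt.mean(axis=0)
rapid = rt < threshold

for j in range(3):   # show the first few items
    flagged = rapid[:, j]
    print(f"item {j}: {flagged.mean():.1%} flagged; "
          f"accuracy rapid={correct[flagged, j].mean():.2f} "
          f"vs. non-rapid={correct[~flagged, j].mean():.2f}")
```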
566.
This research examined correlation estimates between latent abilities when using the two-dimensional and three-dimensional compensatory and noncompensatory item response theory models. Simulation study results showed that the recovery of the latent correlation was best when the test contained 100% simple-structure items, for all models and conditions. When a test measured weakly discriminated dimensions, it became harder to recover the latent correlation. Results also showed that increasing the sample size or test length, or using simpler models (i.e., two-parameter logistic rather than three-parameter logistic, compensatory rather than noncompensatory), could improve the recovery of the latent correlation.
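For context, the sketch below generates simple-structure, two-dimensional compensatory 2PL data with a known latent correlation and shows how much the raw sum-score correlation is attenuated relative to it; an actual recovery study would instead estimate the correlation by fitting the compensatory or noncompensatory MIRT model. The sample size, item parameters, and true correlation are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_persons, items_per_dim = 2000, 20
rho = 0.5   # true latent correlation (illustrative)

# Two correlated abilities.
theta = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n_persons)

# Simple-structure compensatory 2PL: each item loads on exactly one dimension.
a = rng.uniform(0.8, 2.0, size=2 * items_per_dim)
b = rng.normal(0, 1, size=2 * items_per_dim)
dim = np.repeat([0, 1], items_per_dim)
logits = a * (theta[:, dim] - b)
resp = (rng.random(logits.shape) < 1 / (1 + np.exp(-logits))).astype(int)

# Crude check: the sum-score correlation is attenuated by measurement error
# relative to the latent correlation rho that a MIRT model would recover.
s1 = resp[:, dim == 0].sum(axis=1)
s2 = resp[:, dim == 1].sum(axis=1)
print("observed sum-score correlation:", np.corrcoef(s1, s2)[0, 1])
print("true latent correlation:", rho)
```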
567.
Abstract

Differential item functioning (DIF) is a pernicious statistical issue that can mask true group differences on a target latent construct. A considerable amount of research has focused on evaluating methods for testing DIF, such as using likelihood ratio tests in item response theory (IRT). Most of this research has focused on the asymptotic properties of DIF testing, in part because many latent variable methods require large samples to obtain stable parameter estimates. Much less research has evaluated these methods in small sample sizes, despite the fact that many social and behavioral scientists frequently encounter small samples in practice. In this article, we examine the extent to which model complexity—the number of model parameters estimated simultaneously—affects the recovery of DIF in small samples. We compare three models that vary in complexity: logistic regression with sum scores, the 1-parameter logistic IRT model, and the 2-parameter logistic IRT model. We expected that logistic regression with sum scores and the 1-parameter logistic IRT model would more accurately estimate DIF because these models yield more stable estimates despite being misspecified. Indeed, a simulation study and an empirical example of adolescent substance use show that, even when data are generated from, or assumed to follow, a 2-parameter logistic IRT model, using parsimonious models in small samples leads to more powerful tests of DIF while adequately controlling the Type I error rate. We also provide evidence for minimum sample sizes needed to detect DIF, and we evaluate whether applying corrections for multiple testing is advisable. Finally, we provide recommendations for applied researchers who conduct DIF analyses in small samples.
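The most parsimonious of the three approaches, logistic regression with sum scores, tests uniform DIF by checking whether group membership still predicts the item response after conditioning on a rest score, typically via a likelihood-ratio test between nested models. A minimal sketch with simulated small-sample data (it assumes statsmodels and SciPy are available; the DIF size and sample size are illustrative):

```python
import numpy as np
from scipy.stats import chi2
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, n_items = 300, 10   # small sample, as in the article's focus

# Simulate 1PL-like data with uniform DIF on the studied item (illustrative).
group = rng.integers(0, 2, size=n)        # 0 = reference, 1 = focal
theta = rng.normal(size=n)
b = rng.normal(0, 1, size=n_items)
dif = np.zeros(n_items)
dif[0] = 0.6                              # studied item is item 0
p = 1 / (1 + np.exp(-(theta[:, None] - b - dif * group[:, None])))
resp = (rng.random((n, n_items)) < p).astype(int)

y = resp[:, 0]
rest_score = resp[:, 1:].sum(axis=1)      # sum score excluding the studied item

# Nested logistic models: null = score only; DIF = score + group (uniform DIF).
X0 = sm.add_constant(np.column_stack([rest_score]))
X1 = sm.add_constant(np.column_stack([rest_score, group]))
m0 = sm.Logit(y, X0).fit(disp=0)
m1 = sm.Logit(y, X1).fit(disp=0)

lr = 2 * (m1.llf - m0.llf)                # likelihood-ratio statistic, 1 df
print(f"LR statistic = {lr:.2f}, p = {chi2.sf(lr, df=1):.4f}")
```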
568.
Personality development research heavily relies on the comparison of scale means across age. This approach implicitly assumes that the scales are strictly measurement invariant across age. We questioned this assumption by examining whether appropriate personality indicators change over the lifespan. Moreover, we identified which types of items (e.g. dispositions, behaviours, and interests) are particularly prone to age effects. We reanalyzed the German Revised NEO Personality Inventory normative sample (N = 11,724) and applied a genetic algorithm to select short scales that yield acceptable model fit and reliability across locally weighted samples ranging from 16 to 66 years of age. We then examined how the item selection changes across age points and item types. Emotion‐type items seemed to be interchangeable and generally applicable to people of all ages. Specific interests, attitudes, and social effect items—most prevalent within the domains of Extraversion, Agreeableness, and Openness—seemed to be more prone to measurement variations over age. A large proportion of items were systematically discarded by the item‐selection procedure, indicating that, independent of age, many items are problematic measures of the underlying traits. The implications for personality assessment and personality development research are discussed.
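As a rough illustration of the "locally weighted samples" idea, the sketch below forms kernel-weighted samples around focal ages and scores candidate item subsets by their minimum locally weighted reliability; a cheap random search stands in for the article's genetic algorithm, and weighted coefficient alpha stands in for its model-fit and reliability criteria. All data and tuning values are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
n, n_items = 2000, 12

# Synthetic "personality" items and respondent ages (illustrative only).
age = rng.uniform(16, 66, size=n)
trait = rng.normal(size=n)
items = (trait[:, None] * rng.uniform(0.3, 0.8, n_items)
         + rng.normal(scale=0.8, size=(n, n_items)))

def local_weights(age, focal_age, bandwidth=5.0):
    """Gaussian kernel weights for a locally weighted sample around a focal age."""
    return np.exp(-0.5 * ((age - focal_age) / bandwidth) ** 2)

def weighted_alpha(items, w):
    """Coefficient alpha computed from weighted (co)variances."""
    w = w / w.sum()
    centered = items - w @ items
    cov = centered.T @ (centered * w[:, None])
    k = items.shape[1]
    return k / (k - 1) * (1 - np.trace(cov) / cov.sum())

def fitness(subset, focal_ages=(20, 40, 60)):
    """Minimum local reliability across focal ages (to be maximized)."""
    return min(weighted_alpha(items[:, list(subset)], local_weights(age, fa))
               for fa in focal_ages)

# Random-search stand-in for the genetic algorithm used in the article.
best = max((frozenset(rng.choice(n_items, 6, replace=False)) for _ in range(300)),
           key=fitness)
print("selected items:", sorted(best), "fitness:", round(fitness(best), 3))
```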
569.
The current study examined the relationship between test-taker cognition and psychometric item properties in multiple-selection multiple-choice and grid items. In a study with content-equivalent mathematics items in alternative item formats, adult participants’ tendency to respond to an item was affected by the presence of a grid and variations of answer options. The results of an item response theory analysis were consistent with the hypothesized cognitive processes in alternative item formats. The findings suggest that seemingly subtle variations of item design could substantially affect test-taker cognition and psychometric outcomes, emphasizing the need for investigating item format effects at a fine-grained level.
570.
The standard error (SE) stopping rule, which terminates a computer adaptive test (CAT) when the SE is less than a threshold, is effective when there are informative questions for all trait levels. However, in domains such as patient-reported outcomes, the items in a bank might all target one end of the trait continuum (e.g., negative symptoms), and the bank may lack depth for many individuals. In such cases, the predicted standard error reduction (PSER) stopping rule will stop the CAT even if the SE threshold has not been reached and can avoid administering excessive questions that provide little additional information. By tuning the parameters of the PSER algorithm, a practitioner can specify a desired tradeoff between accuracy and efficiency. Using simulated data for the Patient-Reported Outcomes Measurement Information System Anxiety and Physical Function banks, we demonstrate that these parameters can substantially impact CAT performance. When the parameters were optimally tuned, the PSER stopping rule was found to outperform the SE stopping rule overall, particularly for individuals not targeted by the bank, and presented roughly the same number of items across the trait continuum. Therefore, the PSER stopping rule provides an effective method for balancing the precision and efficiency of a CAT.
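In simplified form, the PSER idea is to stop not only when the SE target is reached but also when even the most informative remaining item is predicted to reduce the SE by less than some minimum amount. The sketch below uses a 2PL bank and the approximation SE = 1/sqrt(test information); the thresholds, bank parameters, and the exact predicted-reduction rule are illustrative assumptions rather than the published algorithm.

```python
import numpy as np

def fisher_info(theta, a, b):
    """2PL item information at ability theta."""
    p = 1 / (1 + np.exp(-a * (theta - b)))
    return a**2 * p * (1 - p)

def pser_stop(theta_hat, administered, a, b,
              se_target=0.30, min_reduction=0.01, max_items=30):
    """Stop if the SE target is met, the item budget is exhausted, or the best
    remaining item is predicted to reduce the SE by less than `min_reduction`
    (a simplified reading of the PSER idea)."""
    info_now = fisher_info(theta_hat, a[administered], b[administered]).sum()
    se_now = 1 / np.sqrt(info_now) if info_now > 0 else np.inf
    if se_now <= se_target or len(administered) >= max_items:
        return True
    remaining = np.setdiff1d(np.arange(len(a)), administered)
    best_info = fisher_info(theta_hat, a[remaining], b[remaining]).max()
    predicted_se = 1 / np.sqrt(info_now + best_info)
    return (se_now - predicted_se) < min_reduction

# Example: a bank that mostly targets high trait levels (as in symptom banks),
# assessed for an examinee the bank does not target well.
rng = np.random.default_rng(7)
a = rng.uniform(1.0, 2.5, 40)
b = rng.normal(1.5, 0.5, 40)   # items concentrated at the high end
print(pser_stop(theta_hat=-1.0, administered=[0, 1, 2, 3, 4], a=a, b=b))
```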