Similar Literature
 Found 20 similar documents (search time: 31 ms)
1.
Speech Perception Within an Auditory Cognitive Science Framework   Total citations: 1, self-citations: 0, citations by others: 1
ABSTRACT: The complexities of the acoustic speech signal pose many significant challenges for listeners. Although perceiving speech begins with auditory processing, investigation of speech perception has progressed mostly independently of study of the auditory system. Nevertheless, a growing body of evidence demonstrates that cross-fertilization between the two areas of research can be productive. We briefly describe research bridging the study of general auditory processing and speech perception, showing that the latter is constrained and influenced by operating characteristics of the auditory system and that our understanding of the processes involved in speech perception is enhanced by study within a more general framework. The disconnect between the two areas of research has stunted the development of a truly interdisciplinary science, but there is an opportunity for great strides in understanding with the development of an integrated field of auditory cognitive science.

2.
We hypothesize that a cognitive analysis based on the construction-integration theory of comprehension (Kintsch, 1988) can predict what is difficult about generating complex composite commands in the UNIX operating system. We provide empirical support for assumptions of the Doane, Kintsch, and Polson (1989, 1990) construction-integration model for generating complex commands in UNIX. We asked users whose UNIX experience varied to produce complex UNIX commands, and then provided help prompts whenever the commands that they produced were erroneous. The help prompts were designed to assist subjects with respect to both the knowledge and the memory processes that our UNIX modeling efforts have suggested are lacking in less expert users. It appears that experts respond to different prompts than do novices. Expert performance is helped by the presentation of abstract information, whereas novice and intermediate performance is modified by presentation of concrete information. Second, while presentation of specific prompts helps less expert subjects, they do not provide sufficient information to obtain correct performance. Our analyses suggest that information about the ordering of commands is required to help the less expert with both knowledge and memory load problems in a manner consistent with skill acquisition theories.

3.
The Yale experimental programming system is described. This system consists of user-callable subroutines and a set of modifications to the UNIX operating system which support the development and execution of real-time psychology experiments. The interactions between a timesharing operating system and the provision of real-time facilities for experiments are discussed.

4.
We describe the development of a new system for categorizing thought disorder. In the development phase (Study 1), we examined the degree to which speech samples and definitions of thought disorder subtypes taken from (1) the Scale for the Assessment of Thought, Language, and Communication (TLC); (2) the Thought Disorder Index (TDI); and (3) the Assessment of Bizarre-Idiosyncratic Thinking (BIT), reflected disturbances in form versus disturbances in content. Ratings were provided by naive judges, experienced clinicians, and linguistic experts. The results contributed to the development of a new system dividing thought disorder into disturbances in (1) fluency, (2) discourse coherence, (3) content, and (4) social convention. In the validation phase (Study 2), 21 schizophrenic and 19 manic subjects were interviewed, interpreted proverbs, and responded to Rorschach cards. Subjects' speech was rated using the TLC, TDI, and BIT. We also measured hallucinations, delusions, and digit span performance. The results of Study 2 provided evidence supporting the validity of our new categorization system.

5.
Although there is considerable variability, most individuals with Down syndrome have mental retardation and speech and language deficits, particularly in language production and syntax, as well as poor speech intelligibility. This article describes research findings in the language and communication development of individuals with Down syndrome, first briefly describing the physical and cognitive phenotype of Down syndrome and two communication-related domains: hearing and oral motor skills. Next, we describe language development in Down syndrome, focusing on communication behaviors in the prelinguistic period, then the development of language in children and adolescents, and finally language development in adults and the aging period. We describe language development in individuals with Down syndrome across four domains: phonology, semantics, syntax, and pragmatics. We then suggest strategies for intervention and directions for research relating to individuals with Down syndrome.

6.
The development in the interface of smart devices has led to voice interactive systems. An additional step in this direction is to enable the devices to recognize the speaker. But this is a challenging task because the interaction involves short duration speech utterances. The traditional Gaussian mixture models (GMM) based systems have achieved satisfactory results for speaker recognition only when the speech lengths are sufficiently long. The current state-of-the-art method utilizes an i-vector based approach using a GMM based universal background model (GMM-UBM). It prepares an i-vector speaker model from a speaker’s enrollment data and uses it to recognize any new test speech. In this work, we propose a multi-model i-vector system for short speech lengths. We use an open database THUYG-20 for the analysis and development of short speech speaker verification and identification system. By using an optimum set of mel-frequency cepstrum coefficients (MFCC) based features we are able to achieve an equal error rate (EER) of 3.21% as compared to the previous benchmark score of EER 4.01% on the THUYG-20 database. Experiments are conducted for speech lengths as short as 0.25 s and the results are presented. The proposed method shows improvement as compared to the current i-vector based approach for shorter speech lengths. We are able to achieve improvement of around 28% even for 0.25 s speech samples. We also prepared and tested the proposed approach on our own database with 2500 speech recordings in the English language consisting of actual short speech commands used in any voice interactive system.

7.
The perceptual system for speech is highly organized from early infancy. This organization bootstraps young human learners’ ability to acquire their native speech and language from speech input. Here, we review behavioral and neuroimaging evidence that perceptual systems beyond the auditory modality are also specialized for speech in infancy, and that motor and sensorimotor systems can influence speech perception even in infants too young to produce speech-like vocalizations. These investigations complement existing literature on infant vocal development and on the interplay between speech perception and production systems in adults. We conclude that a multimodal speech and language network is present before speech-like vocalizations emerge.

8.
We describe a general-purpose, programmable system that provides high-quality, low-cost devices for experimentation in psychoacoustics and speech perception. The system is controlled by a host computer (e.g., an IBM PC), over a serial line. Through the use of a high-level, general-purpose experiment control program, the designed interconnection of devices can be specified logically, and the settings of the devices modified dynamically, during the experiment.

9.
J L Miller  P W Jusczyk 《Cognition》1989,33(1-2):111-137
One of the most highly developed human abilities is communication by speech. Throughout the years, research on speech perception has demonstrated that humans are well adapted to extract highly encoded linguistic information from the speech signal. The sophisticated nature of these capacities and their early appearance during development suggest the existence of a rich biological substrate for speech perception. In the present paper, we describe some of these important capacities and examine research from different domains that may help illuminate the nature of their biological foundations.

10.
EVE, the Early Vision Emulation software, is a set of computer programs designed to compute models of early visual processing. EVE may be used with a wide variety of models concerning spatial detection and discrimination, motion analysis, and issues of spatial sampling. EVE is modular and flexible. It runs under the UNIX operating system, and is device-independent. We describe the implementation of the EVE software and discuss how it may be applied to several visual models.

11.
《Ecological Psychology》2013,25(4):333-382
In this article, we attempt to reconcile the linguistic hypothesis that speech involves an underlying sequencing of abstract, discrete, context-independent units, with the empirical observation of continuous, context-dependent interleaving of articulatory movements. To this end, we first review a previously proposed task-dynamic model for the coordination and control of the speech articulators. We then describe an extension of this model in which invariant speech units (gestural primitives) are identified with context-independent sets of parameters in a dynamical system having two functionally distinct but interacting levels. The intergestural level is defined according to a set of activation coordinates; the interarticulator level is defined according to both model articulator and tract-variable coordinates. In the framework of this extended model, coproduction effects in speech are described in terms of the blending dynamics defined among a set of temporally overlapping active units; the relative timing of speech gestures is formulated in terms of the serial dynamics that shape the temporal patterning of onsets and offsets in unit activations. Implications of this approach for certain phonological issues are discussed, and a range of relevant experimental data on speech and limb motor control is reviewed.

12.
We present a comprehension-based computational model of UNIX user skill acquisition and performance in a training context (UNICOM). The work extends a comprehension-based theory of planning to account for skill acquisition and learning. Individual models of 22 UNIX users were constructed and used to simulate user performance on successive command production problems in a training context. Comparisons of model and the human empirical data result in a high degree of agreement, validating the ability of UNICOM to predict user response to training.

13.
Perception of visual speech and the influence of visual speech on auditory speech perception are affected by the orientation of a talker's face, but the nature of the visual information underlying this effect has yet to be established. Here, we examine the contributions of visually coarse (configural) and fine (featural) facial movement information to inversion effects in the perception of visual and audiovisual speech. We describe two experiments in which we disrupted perception of fine facial detail by decreasing spatial frequency (blurring) and disrupted perception of coarse configural information by facial inversion. For normal, unblurred talking faces, facial inversion had no influence on visual speech identification or on the effects of congruent or incongruent visual speech movements on perception of auditory speech. However, for blurred faces, facial inversion reduced identification of unimodal visual speech and effects of visual speech on perception of congruent and incongruent auditory speech. These effects were more pronounced for words whose appearance may be defined by fine featural detail. Implications for the nature of inversion effects in visual and audiovisual speech are discussed.

14.
15.
A theory of speech monitoring, proposed by Levelt (1983), assumes that the quality of one's speech is checked by the speech comprehension system. This system inspects one's own overt speech but would also inspect an inner speech plan ("the inner loop"). We have elaborated and tested this theory by way of formalizing it as a computational model. This model includes a new proposal concerning the timing relation between planning the interruption and the repair: the proposal that these two processes are performed in parallel. We attempted to simulate empirical data about the distribution of error-to-cutoff and cutoff-to-repair intervals and the effect of speech rate on these intervals (these intervals are shorter with faster speech). The main questions were (1) Is an inner monitor that utilizes the speech perception system fast enough to simulate the timing data? (2) Can the model account for the effects of speech rate on these intervals? We conclude that including an inner loop through the speech comprehension system generates predictions that fit the empirical data. The effects of speed can be accounted for, given our proposal about the time course of planning interruption and repair. A novel prediction is that the error-to-cutoff interval decreases with increasing position in the phrase.

16.
Sources of variability in children’s language growth   Total citations: 1, self-citations: 0, citations by others: 1
The present longitudinal study examines the role of caregiver speech in language development, especially syntactic development, using 47 parent–child pairs of diverse SES background from 14 to 46 months. We assess the diversity (variety) of words and syntactic structures produced by caregivers and children. We use lagged correlations to examine language growth and its relation to caregiver speech. Results show substantial individual differences among children, and indicate that diversity of earlier caregiver speech significantly predicts corresponding diversity in later child speech. For vocabulary, earlier child speech also predicts later caregiver speech, suggesting mutual influence. However, for syntax, earlier child speech does not significantly predict later caregiver speech, suggesting a causal flow from caregiver to child. Finally, demographic factors, notably SES, are related to language growth, and are, at least partially, mediated by differences in caregiver speech, showing the pervasive influence of caregiver speech on language growth.

17.
One solution to the difficulty of running real-time applications under UNIX is to develop the application programs on an available UNIX system and execute them on a dedicated satellite processor. This combines the advantages of a powerful timesharing operating system with the real-time capabilities of a single-process system. PARASITE, a real-time satellite system, provides tools for developing the application program on the host and executing it on a satellite. A host utility serves to invoke the standard UNIX C compiler and link its output with the PARASITE library. The PARASITE library consists of routines that mimic the standard library and routines that read and write the real-time peripherals. PARASITE currently supports digital inputs and outputs, asynchronous serial-line interfaces, programmable real-time clocks, and analog-to-digital and digital-to-analog converters. Another PARASITE utility downloads the object code of the user program and PARASITE support code into the satellite, where it runs independently. Once the application routine is executing on the satellite, it controls the satellite processor at all times and is continuously available for servicing hardware-generated interrupts. When a real-time peripheral interrupt routine is invoked by a hardware interrupt, it sends a software signal to the user program, in addition to processing the interrupt. This allows the user program to perform additional tasks that are specific to the application. All data to be permanently stored must be transferred to the host. Since the satellite has no direct access to the resources of the host, a process running on the host receives the data and manages files. PARASITE provides packet driver routines on both the host and the satellite, which together handle the data transmission protocol.

18.
We describe an account of lexically guided tuning of speech perception based on interactive processing and Hebbian learning. Interactive feedback provides lexical information to prelexical levels, and Hebbian learning uses that information to retune the mapping from auditory input to prelexical representations of speech. Simulations of an extension of the TRACE model of speech perception are presented that demonstrate the efficacy of this mechanism. Further simulations show that acoustic similarity can account for the patterns of speaker generalization. This account addresses the role of lexical information in guiding both perception and learning with a single set of principles of information propagation.

19.
We compared the development of spontaneous private speech and its relationship to self-controlled behavior in a sample of 6- to 12-year-olds with attention-deficit hyperactivity disorder (ADHD) and matched normal controls. Thirty-eight boys were observed in their classrooms while engaged in math seatwork. Results revealed that ADHD children were delayed in private speech development in that they engaged in more externalized, self-guiding and less inaudible, internalized speech than normal youngsters. Several findings suggest that the developmental lag was a consequence of a highly unmanageable attentional system that prevents ADHD children's private speech from gaining efficient mastery over behavior. First, self-guiding speech was associated with greater attentional focus only among the least distractible ADHD boys. Second, the most mature, internalized speech forms were correlated with self-stimulating behavior for ADHD subjects but not for controls. Third, observations of ADHD children both on and off stimulant medication indicated that reducing their symptoms substantially increased the maturity of private speech and its association with motor quiescence and attention to task. Results suggest that the Vygotskian hypothesis of a unidirectional path of influence from private speech to self-controlled behavior should be expanded into a bidirectional model. These findings may also shed light on why treatment programs that train children with attentional deficits in speech-to-self have shown limited efficacy.
Preparation of this article was supported in part by National Institute of Mental Health Grant HD22354-01 and a grant from the Graduate School, Illinois State University, to Laura E. Berk. We gratefully acknowledge the assistance of Douglas Hopper, Christine Mitchell, Mary Ann Snyder, Kathleen Szeminska, Deborah Petrillo, and Eric Zehr in collecting the data. We are also grateful to Benjamin Moore, Clinical Director of The Baby Fold, and Sarah Booth, Vice Principal of Metcalf School, Normal, Illinois, for facilitating the research and to the teachers and children for welcoming us into their classrooms.

20.
In this paper, we describe the application of new computer and speech synthesis technologies for reading instruction. Stories are presented on the computer screen, and readers may designate words or parts of words that they cannot read for immediate speech feedback. The important contingency between speech sounds and their corresponding letter patterns is emphasized by displaying the letter patterns in reverse video as they are spoken. Speech feedback is provided by an advanced text-to-speech synthesizer (DECtalk). Intelligibility data are presented, showing that DECtalk can be understood almost as well as natural human speech by both normal adults and reading disabled children. Preliminary data from 26 disabled readers indicate that there are significant benefits of speech feedback for reading comprehension and word recognition, and that children enjoy reading with the system.
