Similar Documents
1.
Many researchers across many experimental domains utilize the latency of spoken responses as a dependent measure. These measurements are typically made with a voice key, an electronic device that monitors the amplitude of a voice signal and detects when a predetermined threshold is crossed. Unfortunately, voice keys have repeatedly been shown to be alarmingly error-prone and biased in detecting speech onset latencies. We present SayWhen, an easy-to-use software system for offline speech onset latency measurement that (1) automatically detects speech onset latencies with high accuracy, well beyond voice key performance, (2) automatically detects and flags the subset of trials most likely to have mismeasured onsets, for optional manual checking, and (3) implements a graphical user interface that greatly speeds and facilitates the checking and correction of this flagged subset of trials. This automatic-plus-selective-checking method approaches the gold-standard performance of full manual coding in a small fraction of the time.
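For readers unfamiliar with the device being replaced, the simple-threshold logic that makes conventional voice keys unreliable can be sketched in a few lines. This is an illustrative reconstruction of the generic technique, not code from SayWhen; the function name and threshold value are assumptions.

```python
import numpy as np

def threshold_onset_ms(signal, sr, threshold=0.05):
    """Generic voice-key logic: report the first sample whose absolute
    amplitude crosses a fixed threshold, in milliseconds. Quiet onsets
    (e.g., /f/ or /s/) cross late or not at all, which is the bias
    SayWhen is designed to correct."""
    idx = np.flatnonzero(np.abs(signal) >= threshold)
    if idx.size == 0:
        return None  # no crossing: the trial would simply be lost
    return 1000.0 * idx[0] / sr

# A 1 s, 16 kHz trial with voicing that starts at 300 ms
sr = 16000
t = np.arange(sr) / sr
trial = np.where(t >= 0.3, 0.2 * np.sin(2 * np.pi * 150 * t), 0.0)
print(threshold_onset_ms(trial, sr))  # ~300 ms
```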

2.
In everyday interactions with others, people have to deal with the sight of a face and the sound of a voice at the same time. How the perceptual system brings this information together over hundreds of milliseconds to perceive others remains unclear. In 2 studies, we investigated how facial and vocal cues are integrated during real-time social categorization by recording participants' hand movements (via the streaming x, y coordinates of the computer mouse) en route to “male” and “female” responses on the screen. Participants were presented with male and female faces that were accompanied by a same-sex voice morphed to be either sex-typical (e.g., a masculinized male voice) or sex-atypical (e.g., a feminized male voice). Before the hand settled into an ultimate sex categorization of the face, simultaneous processing of a sex-atypical voice continuously attracted it toward the opposite sex-category response. This is evidence that ongoing results from voice perception continuously influence face perception throughout processing. Thus, social categorization involves dynamic updating and gradual integration of the face and voice.
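A common way to quantify this kind of attraction in mouse-tracking data is the maximum deviation of the cursor path from the straight line joining its start and end points. The sketch below assumes that measure; it is not the authors' analysis code, and the sample trajectory is invented.

```python
import numpy as np

def max_deviation(x, y):
    """Maximum perpendicular deviation of a trajectory from the straight
    line between its first and last points; larger values indicate
    stronger attraction toward the competing response."""
    p0 = np.array([x[0], y[0]], dtype=float)
    direction = np.array([x[-1] - x[0], y[-1] - y[0]], dtype=float)
    direction /= np.linalg.norm(direction)
    pts = np.column_stack([x, y]).astype(float) - p0
    # Perpendicular distance via the 2-D cross product (|direction| = 1)
    dev = pts[:, 0] * direction[1] - pts[:, 1] * direction[0]
    return float(np.max(np.abs(dev)))

# A trajectory that bows toward the opposite response before settling
x = np.array([0, 10, 30, 80, 100])
y = np.array([0, 40, 60, 90, 100])
print(max_deviation(x, y))  # ~21.2 screen units
```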

3.
We describe a set of pictorial and auditory stimuli that we have developed for use in word learning tasks in which the participant learns pairings of novel auditory sound patterns (names) with pictorial depictions of novel objects (referents). The pictorial referents are drawings of "space aliens," consisting of images that are variants of 144 different aliens. The auditory names are possible nonwords of English; the stimulus set consists of over 2,500 nonword stimuli recorded in a single voice, with controlled onsets, varying from one to seven syllables in length. The pictorial and nonword stimuli can also serve as independent stimulus sets for purposes other than word learning. The full set of these stimuli may be downloaded from www.psychonomic.org/archive/.

4.
During a conversation, we hear the sound of the talker as well as the intended message. Traditional models of speech perception posit that acoustic details of a talker's voice are not encoded with the message whereas more recent models propose that talker identity is automatically encoded. When shadowing speech, listeners often fail to detect a change in talker identity. The present study was designed to investigate whether talker changes would be detected when listeners are actively engaged in a normal conversation, and visual information about the speaker is absent. Participants were called on the phone, and during the conversation the experimenter was surreptitiously replaced by another talker. Participants rarely noticed the change. However, when explicitly monitoring for a change, detection increased. Voice memory tests suggested that participants remembered only coarse information about both voices, rather than fine details. This suggests that although listeners are capable of change detection, voice information is not continuously monitored at a fine-grain level of acoustic representation during natural conversation and is not automatically encoded. Conversational expectations may shape the way we direct attention to voice characteristics and perceive differences in voice.

5.
Many researchers rely on analogue voice keys for psycholinguistic research. However, the triggering of traditional simple threshold voice keys (STVKs) is delayed after response onset, and the delay duration may vary depending on initial phoneme type. The delayed trigger voice key (DTVK), a standalone electronic device that incorporates an additional minimum signal duration parameter, is described and validated in two experiments. In Experiment 1, recorded responses from a nonword naming task were presented to the DTVK and an STVK. As compared with hand-coded reaction times from visual inspection of the waveform, the DTVK was more accurate than the STVK, overall and across initial phoneme type. Rastle and Davis (2002) showed that an STVK more accurately detected an initial [s] when it was followed by a vowel than when followed by a consonant. Participants’ responses from that study were presented to the DTVK in Experiment 2, and accuracy was equivalent for initial [s] in vowel and consonant contexts. Details for the construction of the DTVK are provided.
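The DTVK itself is analog hardware, but its minimum-signal-duration rule is easy to state in software terms: a threshold crossing counts only if the signal stays above threshold for some minimum time. A hedged sketch with illustrative parameter values:

```python
import numpy as np

def dtvk_onset_ms(signal, sr, threshold=0.05, min_dur_ms=15):
    """Delayed-trigger logic: report an onset only when the amplitude
    stays above threshold for at least min_dur_ms, rejecting brief
    clicks and pops that trigger a simple threshold voice key."""
    above = np.abs(signal) >= threshold
    min_len = int(sr * min_dur_ms / 1000)
    run = 0
    for i, a in enumerate(above):
        run = run + 1 if a else 0
        if run >= min_len:
            onset = i - min_len + 1  # first sample of the sustained run
            return 1000.0 * onset / sr
    return None

# A 2 ms click at 100 ms fools a simple key but not this logic
sr = 16000
sig = np.zeros(sr)
sig[1600:1632] = 0.3   # brief click
sig[4800:] = 0.2       # sustained voicing from 300 ms
print(dtvk_onset_ms(sig, sr))  # 300.0
```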

6.
The human voice is the most familiar and important sound in the human auditory environment, carrying a wealth of socially relevant information. As with visual face processing, the brain processes voices in a specialized manner. Using electrophysiology, neuroimaging, and related methods, researchers have identified brain regions that respond selectively to voices, namely the temporal voice areas (TVA), and have found analogous voice-selective regions in nonhuman animals. Voice processing mainly involves the processing of speech, emotional, and identity information, corresponding to three neural pathways that are both independent and interacting. Dual-pathway, multistage, and integrative models have been proposed to account for the processing of speech, emotion, and identity in voices, respectively. Future research should examine whether the specificity of voice processing can be explained by selective processing of particular acoustic features, and should further investigate the neural mechanisms of voice processing in special populations (e.g., individuals with autism or schizophrenia).

7.
The research described in this article had 2 aims: to permit greater precision in the conduct of naming experiments and to contribute to a characterization of the motor execution stage of speech production. The authors report an exhaustive inventory of consonantal and postconsonantal influences on delayed naming latency and onset acoustic duration, derived from a hand-labeled corpus of single-syllable consonant-vowel utterances. Five talkers produced 6 repetitions each of a set of 168 prepared monosyllables, a set that comprised each of the consonantal onsets of English in 3 vowel contexts. Strong and significant effects associated with phonetic characteristics of initial and noninitial phonemes were observed on both delayed naming latency and onset acoustic duration. Results are discussed in terms of the biomechanical properties of the articulatory system that may give rise to these effects and in terms of their methodological implications for naming experiments.

8.
Infants are often spoken to in the presence of background sounds, including speech from other talkers. In the present study, we compared 5- and 8.5-month-olds’ abilities to recognize their own names in the context of three different types of background speech: that of a single talker, multitalker babble, and that of a single talker played backward. Infants recognized their names at a 10-dB signal-to-noise ratio in the multiple-voice condition but not in the single-voice (nonreversed) condition, a pattern opposite to that of typical adult performance. Infants similarly failed to recognize their names when the background talker’s voice was reversed—that is, unintelligible, but with speech-like acoustic properties. These data suggest that infants may have difficulty segregating the components of different speech streams when those streams are acoustically too similar. Alternatively, infants’ attention may be drawn to the time-varying acoustic properties associated with a single talker’s speech, causing difficulties when a single talker is the competing sound.
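For concreteness, the 10-dB signal-to-noise ratio reported here fixes the ratio of target power to background power. A minimal sketch of how a background is typically scaled to a desired SNR (illustrative, not the study's stimulus-preparation code):

```python
import numpy as np

def mix_at_snr(target, background, snr_db):
    """Scale `background` so the target-to-background power ratio is
    snr_db, then return the mixture."""
    p_t = np.mean(target ** 2)
    p_b = np.mean(background ** 2)
    scale = np.sqrt(p_t / (p_b * 10.0 ** (snr_db / 10.0)))
    return target + scale * background

rng = np.random.default_rng(0)
name = rng.normal(size=16000)    # stand-in for the target (infant's name)
babble = rng.normal(size=16000)  # stand-in for the background talker(s)
mixture = mix_at_snr(name, babble, snr_db=10.0)
```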

9.
Everyday experience tells us that some types of auditory sensory information are retained for long periods of time. For example, we are able to recognize friends by their voice alone or identify the source of familiar noises even years after we last heard the sounds. It is thus somewhat surprising that the results of most studies of auditory sensory memory show that acoustic details, such as the pitch of a tone, fade from memory in ca. 10-15 s. One should, therefore, ask (1) what types of acoustic information can be retained for a longer term, (2) what circumstances allow or help the formation of durable memory records for acoustic details, and (3) how such memory records can be accessed. The present review discusses the results of experiments that used a model of auditory recognition, the auditory memory reactivation paradigm. Results obtained with this paradigm suggest that the brain stores features of individual sounds embedded within representations of acoustic regularities that have been detected for the sound patterns and sequences in which the sounds appeared. Thus, sounds closely linked with their auditory context are more likely to be remembered. The representations of acoustic regularities are automatically activated by matching sounds, enabling object recognition.

10.
Most mammals can accomplish acoustic recognition of other individuals by means of “voice cues,” whereby characteristics of the vocal tract render vocalizations of an individual uniquely identifiable. However, sound production in dolphins takes place in gas-filled nasal sacs that are affected by pressure changes, potentially resulting in a lack of reliable voice cues. It is well known that bottlenose dolphins learn to produce individually distinctive signature whistles for individual recognition, but it is not known whether they may also use voice cues. To investigate this question, we played back non-signature whistles to wild dolphins during brief capture-release events in Sarasota Bay, Florida. We hypothesized that non-signature whistles, which have varied contours that can be shared among individuals, would be recognizable to dolphins only if they contained voice cues. Following established methodology used in two previous sets of playback experiments, we found that dolphins did not respond differentially to non-signature whistles of close relatives versus known unrelated individuals. In contrast, our previous studies showed that in an identical context, dolphins reacted strongly to hearing the signature whistle or even a synthetic version of the signature whistle of a close relative. Thus, we conclude that dolphins likely do not use voice cues to identify individuals. The low reliability of voice cues and the need for individual recognition were likely strong selective forces in the evolution of vocal learning in dolphins.

11.
The aims of this study were to investigate the adequacy of electronic voice keys for the purpose of measuring naming latency and to test the assumption that voice key error can be controlled by matching conditions on initial phoneme. Three types of naming latency measurements (hand-coding and 2 types of voice keys) were used to investigate effects of onset complexity (e.g., sat vs. spat) on reading aloud (J. R. Frederiksen & J. F. Kroll, 1976; A. H. Kawamoto & C. T. Kello, 1999). The 3 measurement techniques produced the 3 logically possible results: a significant complexity advantage, a significant complexity disadvantage, and a null effect. Analyses of the performance of each voice key are carried out, and implications for studies of naming latency are discussed.

12.
Voice quality is an important perceptual cue in many disciplines, but knowledge of its nature is limited by a poor understanding of the relevant psychoacoustics. This article (aimed at researchers studying voice, speech, and vocal behavior) describes the UCLA voice synthesizer, software for voice analysis and synthesis designed to test hypotheses about the relationship between acoustic parameters and voice quality perception. The synthesizer provides experimenters with a useful tool for creating and modeling voice signals. In particular, it offers an integrated approach to voice analysis and synthesis and allows easy, precise, spectral-domain manipulations of the harmonic voice source. The synthesizer operates in near real time, using a parsimonious set of acoustic parameters for the voice source and vocal tract that a user can modify to accurately copy the quality of most normal and pathological voices. The software, user’s manual, and audio files may be downloaded from http://brm.psychonomic-journals.org/content/supplemental. Future updates may be downloaded from www.surgery.medsch.ucla.edu/glottalaffairs/.
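The central idea of a spectral-domain harmonic voice source can be sketched independently of the UCLA software: each harmonic of F0 gets its own amplitude, so manipulating source quality reduces to editing a short list of dB values. An illustrative reimplementation of the general technique, not the synthesizer's code:

```python
import numpy as np

def harmonic_source(f0, harmonic_dbs, dur, sr=16000):
    """Sum-of-harmonics voice source: harmonic k has frequency k * f0
    and the amplitude given in dB, so spectral edits (e.g., changing
    the source spectral slope) are edits to harmonic_dbs."""
    t = np.arange(int(dur * sr)) / sr
    source = np.zeros_like(t)
    for k, a_db in enumerate(harmonic_dbs, start=1):
        source += 10.0 ** (a_db / 20.0) * np.sin(2 * np.pi * k * f0 * t)
    return source / np.max(np.abs(source))  # normalize to +/- 1

# A 200 ms source at 120 Hz with a steady -6 dB/harmonic slope
src = harmonic_source(120.0, [0, -6, -12, -18, -24, -30], dur=0.2)
```

A full synthesizer would then pass such a source through a vocal-tract filter; the sketch covers only the harmonic source stage.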

13.
Redford, M. A. (2008). Cognition, 107(3), 785-816.
Three experiments addressed the hypothesis that production factors constrain phonotactic learning in adult English speakers, and that this constraint gives rise to a markedness effect on learning. In Experiment 1, an acoustic measure was used to assess consonant–consonant coarticulation in naturally produced nonwords, which were then used as stimuli in a phonotactic learning experiment. Results indicated that sonority-rising sequences were more coarticulated than sonority-plateauing sequences, and that listeners learned novel rising onsets more readily than novel plateauing onsets. Experiments 2 and 3 addressed the specific questions of whether (1) the acoustic correlates of coarticulation or (2) the coarticulatory patterns of self-productions constrained learning. In Experiment 2, stimulus acoustics were altered to control for coarticulatory differences between sequence types, but a clear markedness effect was still observed. In Experiment 3, listeners’ self-productions were gathered and used to predict their treatment of novel rising and plateauing sequences; listeners’ coarticulatory patterns indeed predicted their treatment of novel sequences. Overall, the findings suggest that the powerful effects of statistical learning are moderated by the perception–production loop in language.

14.
The consistent, but often wrong, impressions people form of the size of unseen speakers are not random but rather point to a systematic misattribution bias, one that the advertising, broadcasting, and entertainment industries also routinely exploit. The authors report 3 experiments examining the perceptual basis of this bias. The results indicate that, under controlled experimental conditions, listeners can make relative size distinctions between male speakers using reliable cues carried in voice formant frequencies (resonant frequencies, or timbre), but that this ability can be perturbed by discordant voice fundamental frequency (F0, or pitch) differences between speakers. The authors introduce 3 accounts for the perceptual pull that voice F0 can exert on our routine (mis)attributions of speaker size and consider the role that voice F0 plays in additional voice-based attributions that may or may not be reliable but that have clear size connotations.
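Formant (resonance) frequencies of the kind implicated here are conventionally estimated from the roots of a linear-predictive-coding polynomial. The sketch below uses librosa's LPC routine and is illustrative only; the model order and the 90-Hz floor are assumptions, not values from the study.

```python
import numpy as np
import librosa

def rough_formants(frame, sr, order=12):
    """Rough formant estimates via LPC: angles of the LPC polynomial
    roots in the upper half-plane correspond to vocal-tract resonance
    frequencies (the timbre cues to apparent speaker size)."""
    a = librosa.lpc(frame.astype(float), order=order)
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]      # one root per conjugate pair
    freqs = np.angle(roots) * sr / (2 * np.pi)
    return np.sort(freqs[freqs > 90.0])    # drop near-DC roots
```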

15.
Animal communication involves very dynamic processes that can generate new uses and functions for established communicative activities. In this article, the authors describe how an aposematic signal, the rattling sound of rattlesnakes (Crotalus viridis), has been exploited by 2 ecological associates of rattlesnakes: (a) California ground squirrels (Spermophilus beecheyi) use incidental acoustic cues in rattling sounds to assess the danger posed by the rattling snake, and (b) burrowing owls (Athene cunicularia) defend themselves against mammalian predators by mimicking the sound of rattling. The remarkable similarity between the burrowing owl's defensive hiss and the rattlesnake's rattling reflects both exaptation and adaptation. Such exploitation of the rattling sound has favored alterations in both the structure and the deployment of rattling by rattlesnakes.

16.
The sound of the voice has several acoustic features that influence the perception of how cooperative the speaker is. It remains unknown, however, whether these acoustic features are associated with actual cooperative behaviour. This issue is crucial for disentangling whether inferences of traits from voices are based on stereotypes or instead facilitate the detection of cooperative partners. The latter possibility could arise from a pleiotropic effect of testosterone on both cooperative behaviours and acoustic features. In the present study, we quantified the cooperativeness of native French-speaking men in a one-shot public good game. We also measured mean fundamental frequency, pitch variation, roughness, and breathiness from spontaneous speech recordings of the same men, and collected saliva samples to measure their testosterone levels. Our results showed that men with lower-pitched voices and greater pitch variation were more cooperative. However, testosterone influenced neither cooperative behaviour nor the acoustic features. Our findings provide the first evidence of acoustic correlates of cooperative behaviour. Considered alongside the literature on detecting cooperativeness from faces, they imply that assessment of cooperative behaviour would be improved by simultaneous consideration of visual and auditory cues.

17.
Speaking fundamental frequency (SFF), the average fundamental frequency (the lowest frequency of a complex periodic sound) measured over the speaking time of a vocal or speech task, is a basic acoustic measure in clinical evaluation and treatment of voice disorders. Currently, there are few data on the acoustic characteristics of different sociolinguistic groups, and no published data on the fundamental frequency characteristics of Arabic speech. The purpose of this study was to obtain preliminary data on the SFF characteristics of a group of normal-speaking, young Arabic men. Fifteen native Arabic-speaking men (M age = 23.5 yr., SD = 2.5) participated, each receiving identical experimental treatment. Four speech samples were collected from each: Arabic reading, Arabic spontaneous speech, English reading, and English spontaneous speech. The samples, analyzed using the Computerized Speech Lab, showed no significant language × speech-type interaction in mean SFF and no significant difference between languages. A significant difference in mean SFF was found between the types of speech: SFF during reading was significantly higher than during spontaneous speech. The Arabic speakers also had higher SFF values than those previously reported for young men in other linguistic groups. SFF may thus differ among linguistic, dialectal, and social groups, and such data may give clinicians useful information for the evaluation and management of voice.
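SFF as defined here is simply the mean F0 over the voiced portions of a task. A simplified sketch of that computation with frame-wise autocorrelation pitch estimates (a stand-in for what a tool such as Computerized Speech Lab reports; the energy gate and search range are assumptions):

```python
import numpy as np

def mean_sff(signal, sr, fmin=60.0, fmax=300.0, frame_ms=40):
    """Mean speaking fundamental frequency: estimate F0 per frame from
    the autocorrelation peak, then average over frames loud enough to
    be treated as voiced speech."""
    n = int(sr * frame_ms / 1000)
    lo, hi = int(sr / fmax), int(sr / fmin)
    f0s = []
    for start in range(0, len(signal) - n, n):
        frame = signal[start:start + n]
        if np.sqrt(np.mean(frame ** 2)) < 0.01:  # crude silence gate
            continue
        ac = np.correlate(frame, frame, mode="full")[n - 1:]
        lag = lo + int(np.argmax(ac[lo:hi]))
        f0s.append(sr / lag)
    return float(np.mean(f0s)) if f0s else None

# Sanity check on a synthetic 120 Hz "voice"
sr = 16000
t = np.arange(sr) / sr
print(mean_sff(0.3 * np.sin(2 * np.pi * 120 * t), sr))  # ~120 Hz
```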

18.
Analysis of acoustic interactions between animals in active choruses is complex because of the large numbers of individuals present, their high calling rates, and the considerable numbers of vocalizations that either overlap or closely alternate in time. The authors describe a methodology for recording chorus activity in bullfrogs (Rana catesbeiana) using multiple, closely spaced acoustic sensors that provide simultaneous estimates of sound direction and sound characteristics. This method provides estimates of the location of individual callers even under conditions of call overlap, and is a useful technique for understanding the complexity of the acoustic scene faced by animals vocalizing in groups.
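Direction estimates from closely spaced sensors are classically obtained from time differences of arrival (TDOA): cross-correlate two channels, take the lag of the peak, and convert it to an angle. A hedged sketch of that standard computation (not the authors' own system):

```python
import numpy as np

def bearing_deg(ch1, ch2, sr, spacing_m, c=343.0):
    """Bearing of a caller from one microphone pair: the lag of the
    cross-correlation peak gives the time difference of arrival, and
    arcsin(tdoa * c / spacing) gives the angle from broadside."""
    xc = np.correlate(ch1, ch2, mode="full")
    lag = int(np.argmax(xc)) - (len(ch2) - 1)  # samples; sign = side
    sin_theta = np.clip(lag / sr * c / spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(sin_theta)))

# Synthetic test: one channel delayed by 5 samples relative to the other
sr, d = 48000, 5
sig = np.random.default_rng(1).normal(size=4096)
ch1 = np.concatenate([np.zeros(d), sig[:-d]])
print(bearing_deg(ch1, sig, sr, spacing_m=0.1))  # ~21 degrees
```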
