Abstract: | Listeners infer which object in a visual scene a speaker refers to from the systematic variation of the speaker's tone of voice (ToV). We examined whether ToV also guides word learning. During exposure, participants heard novel adjectives (e.g., “daxen”) spoken with a ToV representing hot, cold, strong, weak, big, or small while viewing picture pairs representing the meaning of the adjective and its antonym (e.g., elephant–ant for big–small). Eye fixations were recorded to monitor referent detection and learning. During test, participants heard the adjectives spoken with a neutral ToV, while selecting referents from familiar and unfamiliar picture pairs. Participants were able to learn the adjectives' meanings, and, even in the absence of informative ToV, generalize them to new referents. A second experiment addressed whether ToV provides sufficient information to infer the adjectival meaning or needs to operate within a referential context providing information about the relevant semantic dimension. Participants who saw printed versions of the novel words during exposure performed at chance during test. ToV, in conjunction with the referential context, thus serves as a cue to word meaning. ToV establishes relations between labels and referents for listeners to exploit in word learning. |