Can humans get arbitrarily capable reinforcement learning (RL) agents to do their bidding? Or will sufficiently capable RL agents always find ways to bypass their intended objectives by shortcutting their reward signal? This question impacts how far RL can be scaled, and whether alternative paradigms must be developed in order to build safe artificial general intelligence. In this paper, we study when an RL agent has an instrumental goal to tamper with its reward process, and describe design principles that prevent instrumental goals for two different types of reward tampering (reward function tampering and RF-input tampering). Combined, the design principles can prevent reward tampering from being an instrumental goal. The analysis benefits from causal influence diagrams to provide intuitive yet precise formalizations.
This paper describes narcissism as our natural, infant-like behaviour, and specifies three characteristics relating to the evacuation and avoidance of distress, to control, and to distancing from relationship. It suggests that the processes of incarnation and individuation represent the development of the early ego and the ‘resolution’ of these narcissistic ways of being, including escape from the ‘mirror trap’. The development of the early, controlling ‘homunculus-ego’ entails the loosening and broadening of ego-identifications and the ego becoming subject to the Self. This is an embodied-relational-social-spiritual process outlined in detail by the full 20 woodcuts of the Rosarium Philosophorum, and specifically the lunar (relational) and solar (self-expressive) paths, which are explored herein. These processes are illustrated with respect to the interaction between infant and caregiver, clinical vignettes and examples from the political sphere.
We are grateful to John Jost for carefully engaging with our work and presenting a different interpretation of our findings on the effects of fear and anger stemming from the November 13, 2015, Paris attacks on the propensity to vote for the far right. Jost advances a model that holds that anger mediates the effect of fear on support for the far right. In this rejoinder, we respond to the issues he raises regarding our model specification, consider his alternative suggestion, and offer some conclusions about how to resolve this debate empirically. We hope this exchange advances the literature on the impact of various societal threats on voting for the far right.
Speech unfolds over time, and the cues for even a single phoneme are rarely available simultaneously. Consequently, to recognize a single phoneme, listeners must integrate material over several hundred milliseconds. Prior work contrasts two accounts: (a) a memory buffer account in which listeners accumulate auditory information in memory and only access higher level representations (i.e., lexical representations) when sufficient information has arrived; and (b) an immediate integration scheme in which lexical representations can be partially activated on the basis of early cues and then updated when more information arrives. These studies have uniformly shown evidence for immediate integration for a variety of phonetic distinctions. We attempted to extend this to fricatives, a class of speech sounds that requires not only temporal integration of asynchronous cues (the frication, followed by the formant transitions 150–350 ms later), but also integration across different frequency bands and compensation for contextual factors like coarticulation. Eye movements in the visual world paradigm showed clear evidence for a memory buffer. Results were replicated in five experiments, ruling out methodological factors and tying the release of the buffer to the onset of the vowel. These findings support a general auditory account for speech by suggesting that the acoustic nature of particular speech sounds may have large effects on how they are processed. They also have major implications for theories of auditory and speech perception by raising the possibility of an encapsulated memory buffer in early auditory processing.
In this longitudinal, qualitative case study, 21 clinical and counseling psychology trainees met in leaderless peer supervision groups for 1 training year to discuss multicultural aspects of their clinical work. Peer supervision sessions were audio recorded and transcribed, and the content was analyzed using thematic analysis. Results indicated that, despite the absence of experts to facilitate discussions, participants were able to focus on multicultural issues and generally benefited from this type of peer supervision.
Is the structure of lexical representations universal, or do languages vary in the fundamental ways in which they represent lexical information? Here, we consider a touchstone case: whether Semitic languages require a special morpheme, the consonantal root. In so doing, we explore a well-known constraint on the location of identical consonants that has often been used as motivation for root representations in Semitic languages: Identical consonants frequently occur at the end of putative roots (e.g., skk), but rarely occur in their beginning (e.g., ssk). Although this restriction has traditionally been stated over roots, an alternative account could be stated over stems, a representational entity that is found more widely across the world's languages. To test this possibility, we investigate the acceptability of a single set of roots, manifesting identity initially, finally or not at all (e.g., ssk versus skk versus rmk) across two nominal paradigms: CéCeC (a paradigm in which identical consonants are rare) and CiCúC (a paradigm in which identical consonants are frequent). If Semitic lexical representations consist of roots only, then similar restrictions on consonant co-occurrence should be observed in the two paradigms. Conversely, if speakers store stems, then the restriction on consonant co-occurrence might be modulated by the properties of the nominal paradigm (be it by means of statistical properties or their grammatical sources). Findings from rating and lexical decision experiments with both visual and auditory stimuli support the stem hypothesis: compared to controls (e.g., rmk), forms with identical consonants (e.g., ssk, skk) are less acceptable in the CéCeC than in the CiCúC paradigm. Although our results do not falsify root-based accounts, they strongly raise the possibility that stems could account for the observed restriction on consonantal identity. 
As such, our results raise a fresh challenge to the notion that different languages require distinct sets of representational resources.
For nearly 30 years researchers have investigated how bodyweight affects evaluative workplace outcomes, such as hiring decisions and performance appraisals. Despite this, no meta-analytic review has been undertaken to quantify the negative impact that bodyweight has on such outcomes. The results of this meta-analytic study suggest that in relation to non-overweight individuals in the workplace, overweight individuals may be disadvantaged across evaluative workplace outcomes (d = −.52). Further, differences in magnitude of the effects of weight-based bias were found for hiring (d = −.70) and performance (d = −.23) outcomes.