Similar literature
 20 similar articles found (search time: 15 ms)
1.
Theories of probabilistic reinforcement.   Total citations: 9, self-citations: 8, citations by others: 1
In three experiments, pigeons chose between two alternatives that differed in the probability of reinforcement and the delay to reinforcement. A peck at a red key led to a delay of 5 s and then a possible reinforcer. A peck at a green key led to an adjusting delay and then a certain reinforcer. This delay was adjusted over trials so as to estimate an indifference point, or a duration at which the two alternatives were chosen about equally often. In Experiments 1 and 2, the intertrial interval was varied across conditions, and these variations had no systematic effects on choice. In Experiment 3, the stimuli that followed a choice of the red key differed across conditions. In some conditions, a red houselight was presented for 5 s after each choice of the red key. In other conditions, the red houselight was present on reinforced trials but not on nonreinforced trials. Subjects exhibited greater preference for the red key in the latter case. The results were used to evaluate four different theories of probabilistic reinforcement. The results were most consistent with the view that the value or effectiveness of a probabilistic reinforcer is determined by the total time per reinforcer spent in the presence of stimuli associated with the probabilistic alternative. According to this view, probabilistic reinforcers are analogous to reinforcers that are delivered after variable delays.
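The adjusting-delay procedure described in this abstract can be pictured as a simple titration loop. The sketch below is only an illustration, not the authors' procedure: the block rule (lengthen the adjusting delay when the adjusting alternative is preferred, shorten it when the standard is preferred), the deterministic simulated subject, the hyperbolic value function, and the values of `k`, `step_s`, and `n_blocks` are all assumptions introduced here.

```python
def hyperbolic_value(delay_s, amount=1.0, k=0.2):
    """Assumed value function: value falls hyperbolically with delay."""
    return amount / (1.0 + k * delay_s)


def titrate(standard_value, start_delay_s=0.0, step_s=1.0, n_blocks=200):
    """Adjust the delay until a simulated subject is roughly indifferent.

    standard_value: value of the standard (non-adjusting) alternative.
    Returns the adjusting delay, in seconds, after the last block.
    """
    delay = start_delay_s
    for _ in range(n_blocks):
        adjusting_value = hyperbolic_value(delay)
        # Hypothetical deterministic subject: within a block it always
        # picks whichever alternative currently has the higher value.
        if adjusting_value > standard_value:
            # Adjusting side preferred: lengthen its delay.
            delay += step_s
        elif adjusting_value < standard_value:
            # Standard side preferred: shorten the adjusting delay.
            delay = max(0.0, delay - step_s)
        # Equal values: choices would split, so the delay is unchanged.
    return delay


if __name__ == "__main__":
    # Standard alternative: a certain reinforcer after a 10-s delay.
    print(titrate(hyperbolic_value(10.0)))
```

With a real subject, choices are variable, so the adjusting delay fluctuates around the indifference point and is typically summarized by averaging over stable sessions rather than read off a single final value.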

2.
In an adjusting-delay choice procedure, pigeons could peck on either a red key or a green key. A peck on the red key always led to a delay associated with red houselights and then food. The delay was adjusted over trials to estimate an indifference point--a delay at which the two keys were chosen about equally often. In some conditions, a peck on the green key led to food on all trials after delays of either 10 s or 30 s, and green houselights were lit during the delays. In other conditions, food was presented on only half of the green-key trials. If the green houselights continued to occur on both reinforcement and nonreinforcement trials, preference for the green key always decreased. Preference for the green key also decreased if half of the trials had 30-s houselights followed by food and the other half had no green houselights and no food. However, preference for the green key actually increased if half of the trials had 10-s green houselights followed by food and the other half had no green houselights followed by no food. The latter condition therefore demonstrated a case in which preference for an alternative increased when food was removed from half of the trials. The results suggest that the red and green houselights served as conditioned reinforcers. A hyperbolic decay model (Mazur, 1989) provided good predictions for all conditions by assuming that the strength of a conditioned reinforcer is inversely related to the total time spent in its presence before food is delivered.
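As a rough illustration of the hyperbolic decay idea, the sketch below uses the common form V = A / (1 + K·D) and treats a probabilistic reinforcer as a mixture over the total time spent in the presence of the conditioned reinforcer before food. The parameter values (`k=0.2`, `amount=1.0`), the function names, and the geometric weighting over trials are assumptions made for this example; the abstract does not give the fitted equation or parameters.

```python
def hyperbolic_value(delay_s, amount=1.0, k=0.2):
    """V = A / (1 + K * D); k and amount are illustrative assumptions."""
    return amount / (1.0 + k * delay_s)


def probabilistic_value(p, delay_s, signal_on_nonreinforced=True,
                        amount=1.0, k=0.2, n_terms=2000):
    """Value of a reinforcer delivered with probability p after delay_s.

    If the delay stimulus also appears on nonreinforced trials, the time
    spent in its presence accumulates across trials until food arrives,
    so food on the n-th trial (probability p * (1 - p)**(n - 1)) follows
    n * delay_s of stimulus time. If the stimulus appears only on
    reinforced trials, only one delay ever precedes food.
    """
    if not signal_on_nonreinforced:
        return hyperbolic_value(delay_s, amount, k)
    total = 0.0
    for n in range(1, n_terms + 1):
        weight = p * (1.0 - p) ** (n - 1)
        total += weight * hyperbolic_value(n * delay_s, amount, k)
    return total


def indifference_delay(value, amount=1.0, k=0.2):
    """Delay to a certain reinforcer that has the given value."""
    return (amount / value - 1.0) / k


if __name__ == "__main__":
    for signalled in (True, False):
        v = probabilistic_value(0.5, 10.0, signal_on_nonreinforced=signalled)
        print(signalled, round(v, 3), round(indifference_delay(v), 1))
```

Under these assumptions, removing the stimulus from nonreinforced trials raises the computed value because no stimulus time accumulates without food. The sketch is not fitted to the data summarized above and should not be read as reproducing its results.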

3.
In a discrete-trials procedure with pigeons, a response on a green key led to a 4-s delay (during which green houselights were lit) and then a reinforcer might or might not be delivered. A response on a red key led to a delay of adjustable duration (during which red houselights were lit) and then a certain reinforcer. The delay was adjusted so as to estimate an indifference point--a duration for which the two alternatives were equally preferred. Once the green key was chosen, a subject had to continue to respond on the green key until a reinforcer was delivered. Each response on the green key, plus the 4-s delay that followed every response, was called one "link" of the green-key schedule. Subjects showed much greater preference for the green key when the number of links before reinforcement was variable (averaging four) than when it was fixed (always exactly four). These findings are consistent with the view that probabilistic reinforcers are analogous to reinforcers delivered after variable delays. When successive links were separated by 4-s or 8-s "interlink intervals" with white houselights, preference for the probabilistic alternative decreased somewhat for 2 subjects but was unaffected for the other 2 subjects. When the interlink intervals had the same green houselights that were present during the 4-s delays, preference for the green key decreased substantially for all subjects. These results provided mixed support for the view that preference for a probabilistic reinforcer is inversely related to the duration of conditioned reinforcers that precede the delivery of food.

4.
In Experiment 1, pigeons' pecks on a green key led to a 5-s delay with green houselights, and then food was delivered on 20% (or, in other conditions, 50%) of the trials. Pecks on a red key led to an adjusting delay with red houselights, and then food was delivered on every trial. The adjusting delay was used to estimate indifference points: delays at which the two alternatives were chosen about equally often. Varying the presence or absence of green houselights during the delays that preceded possible food deliveries had large effects on choice. In contrast, varying the presence of the green or red houselights in the intertrial intervals had no effects on choice. In Experiment 2, pecks on the green key led to delays of either 5 s or 30 s with green houselights, and then food was delivered on 20% of the trials. Varying the duration of the green houselights on nonreinforced trials had no effect on choice. The results suggest that the green houselights served as a conditioned reinforcer at some times but not at others, depending on whether or not there was a possibility that a primary reinforcer might be delivered. Given this interpretation of what constitutes a conditioned reinforcer, most of the results were consistent with the view that the strength of a conditioned reinforcer is inversely related to its duration.

5.
Six pigeons were trained on a delayed red-green matching-to-sample task that arranged four delays within sessions. Matching responses intermittently produced either 1.5-s access to food or 4.5-s access to food, and nonmatching responses produced either 1.5-s or 4.5-s blackout. Two phases were conducted: a signaled phase in which the reinforcer magnitudes (small and large) were signaled by houselights (positioned either on the left or right of the chamber), and an unsignaled phase in which there was no correlation between reinforcer magnitude and houselight position. In both phases, the relative frequency with which red and green matching responses produced food was varied across five values. Both matching accuracy and the sensitivity of performance to the distribution of reinforcers for matching responses decreased with increasing delays in both phases. In addition, accuracy and reinforcer sensitivity were significantly lower on signaled small-reinforcer trials compared with accuracy and sensitivity values on signaled large-reinforcer trials and on both types of unsignaled trials. These results are discussed in the context of research on both nonhuman animal and human memory.

6.
Two experiments examined whether postsample signals of reinforcer probability or magnitude affected the accuracy of delayed matching to sample in pigeons. On each trial, red or green choice responses that matched red or green stimuli seen shortly before a variable retention interval were reinforced with wheat access. In Experiment 1, the reinforcer probability was either 0.2 or 1.0 for both red and green responses. Reinforcer probability was signaled by line or cross symbols that appeared after the sample had been presented. In Experiment 2, all correct responses were reinforced, and the signaled reinforcer durations were 1.0 s and 4.5 s. Matching was more accurate when larger or more probable reinforcers were signaled, independently of retention interval duration. Because signals were presented postsample, the effects were not the result of differential attention to the sample.

7.
The differential-outcomes effect is manifest as more accurate performance of a delayed conditional discrimination when alternative choice responses are followed by different reinforcers than when they are followed by the same reinforcer. In Experiment 1, a differential-outcomes effect was demonstrated within sessions by signaling the duration of food access for correct responses with stimuli appearing in conjunction with the sample stimuli. The delayed matching-to-sample performance of 5 pigeons was more accurate when green choice responses (matching a green sample) were followed by 3.5-s food access and red choice responses (matching a red sample) were followed by 0.5-s food access (different-outcome trials) than when the correct choice responses were both followed by 1.5-s reinforcers (same-outcome trials). In Experiment 2, the acquisition of this differential-outcomes effect was characterized by a progressive decrease in rate of forgetting on different-outcome trials and no change in rate of forgetting on same-outcome trials. In addition, accuracy at the shortest delay intervals for both different-outcome and same-outcome trials increased over acquisition, but to a greater extent for different-outcome trials. These data suggest that both memorial and attentional (time-dependent and time-independent) factors contribute to the differential-outcomes effect.

8.
Four experiments, each with 6 human subjects, varied the distribution of reinforcers for correct responses and the probability of sample-stimulus presentation in symbolic matching-to-sample procedures. Experiment 1 held the sample-stimulus probability constant and varied the ratio of reinforcers obtained for correct responses on the two alternatives across conditions. There was a positive relation between measures of response bias and the ratio of reinforcers. Experiment 2 held the ratio of reinforcers constant and varied the sample-stimulus probability across conditions. Unlike previous studies that used pigeons as subjects, there was a negative relation between bias and the ratio of sample-stimulus presentations. In Experiment 3, the sample-stimulus probability and the reinforcer ratio covaried across conditions. Response bias did not vary systematically across conditions. In Experiments 1 to 3, correct responses were reinforced intermittently. Experiment 4 used the same procedure as Experiment 3, but all correct responses now produced some scheduled consequence. There was a positive relation between response bias and the ratio of reinforcers. The results suggest that human performance in these tasks was controlled by both the relative frequency of reinforced responses and the relative frequency of nonreinforced responses.

9.
Effects of intertrial reinforcers on self-control choice.   Total citations: 1, self-citations: 1, citations by others: 0
In three experiments, pigeons chose between a small amount of food delivered after a short delay and a larger amount delivered after a longer delay. A discrete-trial adjusting-delay procedure was used to estimate indifference points--pairs of delay-amount combinations that were chosen about equally often. In Experiment 1, when additional reinforcers were available during intertrial intervals on a variable-interval schedule, preference for the smaller, more immediate reinforcer increased. Experiment 2 found that this shift in preference occurred partly because the variable-interval schedule started sooner after the smaller, more immediate reinforcer, but there was still a small shift in preference when the durations and temporal locations of the variable-interval schedules were identical for both alternatives. Experiment 3 found greater increases in preference for the smaller, more immediate reinforcer with a variable-interval 15-s schedule than with a variable-interval 90-s schedule. The results were generally consistent with a model that states that the impact of any event that follows a choice response declines according to a hyperbolic function with increasing time since the moment of choice.

10.
In Experiment 1 with rats, a left lever press led to a 5-s delay and then a possible reinforcer. A right lever press led to an adjusting delay and then a certain reinforcer. This delay was adjusted over trials to estimate an indifference point, or a delay at which the two alternatives were chosen about equally often. Indifference points increased as the probability of reinforcement for the left lever decreased. In some conditions with a 20% chance of food, a light above the left lever was lit during the 5-s delay on all trials, but in other conditions, the light was only lit on those trials that ended with food. Unlike previous results with pigeons, the presence or absence of the delay light on no-food trials had no effect on the rats' indifference points. In other conditions, the rats showed less preference for the 20% alternative when the time between trials was longer. In Experiment 2 with rats, fixed-interval schedules were used instead of simple delays, and the presence or absence of the fixed-interval requirement on no-food trials had no effect on the indifference points. In Experiment 3 with rats and Experiment 4 with pigeons, the animals chose between a fixed-ratio 8 schedule that led to food on 33% of the trials and an adjusting-ratio schedule with food on 100% of the trials. Surprisingly, the rats showed less preference for the 33% alternative in conditions in which the ratio requirement was omitted on no-food trials. For the pigeons, the presence or absence of the ratio requirement on no-food trials had little effect. The results suggest that there may be differences between rats and pigeons in how they respond in choice situations involving delayed and probabilistic reinforcers.

11.
The contingencies in each alternative of concurrent procedures consist of reinforcement for staying and reinforcement for switching. For the stay contingency, behavior directed at one alternative earns and obtains reinforcers. For the switch contingency, behavior directed at one alternative earns reinforcers but behavior directed at the other alternative obtains them. In Experiment 1, responses on the main lever, in S1, incremented stay and switch schedules and obtained a stay reinforcer when it became available. Responses on the switch lever changed S1 to S2 and obtained switch reinforcers when available. In S2, neither responses on the main lever nor on the switch lever were reinforced, but a switch response changed S2 to S1. Run lengths and visit durations were a function of the ratio of the scheduled probabilities of reinforcement (staying/switching). From run lengths and visit durations, traditional concurrent performance was synthesized, and that synthesized performance was consistent with the generalized matching law. Experiment 2 replicated and extended this analysis to concurrent variable-interval schedules. The synthesized results challenge any theory of matching that requires a comparison among the alternatives.

12.
Sensitivity to reinforcer duration in a self-control procedure   Total citations: 2, self-citations: 2, citations by others: 0
In a concurrent-chains procedure, pigeons' responses on left and right keys were followed by reinforcers of different durations at different delays following the choice responses. Three pairs of reinforcer delays were arranged in each session, and reinforcer durations were varied over conditions. In Experiment 1 reinforcer delays were unequal, and in Experiment 2 reinforcer delays were equal. In Experiment 1 preference reversal was demonstrated in that an immediate short reinforcer was chosen more frequently than a longer reinforcer delayed 6 s from the choice, whereas the longer reinforcer was chosen more frequently when delays to both reinforcers were lengthened. In both experiments, choice responding was more sensitive to variations in reinforcer duration at overall longer reinforcer delays than at overall shorter reinforcer delays, independently of whether fixed-interval or variable-interval schedules were arranged in the choice phase. We concluded that preference reversal results from a change in sensitivity of choice responding to ratios of reinforcer duration as the delays to both reinforcers are lengthened.
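Sensitivity of choice to reinforcer duration is often expressed as the slope of a generalized-matching-style relation, log(B1/B2) = s · log(M1/M2) + log c, where B is responses and M is reinforcer duration. The sketch below estimates s and log c by ordinary least squares. It is an assumption that this form applies to the analysis summarized above; the function name and the data are hypothetical and invented only to exercise the code.

```python
import math


def fit_sensitivity(duration_ratios, response_ratios):
    """Least-squares fit of log10(B1/B2) = s * log10(M1/M2) + log10(c).

    Returns (s, log10_c): the sensitivity (slope) and log bias (intercept).
    """
    xs = [math.log10(r) for r in duration_ratios]
    ys = [math.log10(r) for r in response_ratios]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Hypothetical data: duration ratios across conditions and the
    # response ratios a subject with s = 0.8, c = 1.2 would produce.
    durations = [0.25, 0.5, 1.0, 2.0, 4.0]
    responses = [1.2 * d ** 0.8 for d in durations]
    s, log_c = fit_sensitivity(durations, responses)
    print(round(s, 3), round(10 ** log_c, 3))
```

Fitting the same relation separately at short and long overall delays and comparing the two slopes is one way to express a delay-dependent change in sensitivity of the kind the abstract describes.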

13.
Choice between single and multiple delayed reinforcers.   Total citations: 7, self-citations: 5, citations by others: 2
Pigeons chose between alternatives that differed in the number of reinforcers and in the delay to each reinforcer. A peck on a red key produced the same consequences on every trial within a condition, but between conditions the number of reinforcers varied from one to three and the reinforcer delays varied between 5 s and 30 s. A peck on a green key produced a delay of adjustable duration and then a single reinforcer. The green-key delay was increased or decreased many times per session, depending on a subject's previous choices, which permitted estimation of an indifference point, or a delay at which a subject chose each alternative about equally often. The indifference points decreased systematically with more red-key reinforcers and with shorter red-key delays. The results did not support the suggestion of Moore (1979) that multiple delayed reinforcers have no effect on preference unless they are closely grouped. The results were well described in quantitative detail by a simple model stating that each of a series of reinforcers increases preference, but that a reinforcer's effect is inversely related to its delay. The success of this model, which considers only delay of reinforcement, suggested that the overall rate of reinforcement for each alternative had no effect on choice between those alternatives.

14.
Parallel experiments with rats and pigeons examined reasons for previous findings that in choices with probabilistic delayed reinforcers, rats' choices were affected by the time between trials whereas pigeons' choices were not. In both experiments, the animals chose between a standard alternative and an adjusting alternative. A choice of the standard alternative led to a short delay (1 s or 3 s), and then food might or might not be delivered. If food was not delivered, there was an "interlink interval," and then the animal was forced to continue to select the standard alternative until food was delivered. A choice of the adjusting alternative always led to food after a delay that was systematically increased and decreased over trials to estimate an indifference point--a delay at which the two alternatives were chosen about equally often. Under these conditions, the indifference points for both rats and pigeons increased as the interlink interval increased from 0 s to 20 s, indicating decreased preference for the probabilistic reinforcer with longer time between trials. The indifference points from both rats and pigeons were well described by the hyperbolic-decay model. In the last phase of each experiment, the animals were not forced to continue selecting the standard alternative if food was not delivered. Under these conditions, rats' choices were affected by the time between trials whereas pigeons' choices were not, replicating results of previous studies. The differences between the behavior of rats and pigeons appear to be the result of procedural details, not a fundamental difference in how these two species make choices with probabilistic delayed reinforcers.

15.
Six pigeons were trained in a delayed matching-to-sample task involving bright- and dim-yellow samples on a central key, a five-peck response requirement to either sample, a constant 1.5-s delay, and the presentation of comparison stimuli composed of red on the left key and green on the right key or vice versa. Green-key responses were occasionally reinforced following the dimmer-yellow sample, and red-key responses were occasionally reinforced following the brighter-yellow sample. Reinforcer delivery was controlled such that the distribution of reinforcers across both comparison-stimulus color and comparison-stimulus location could be varied systematically and independently across conditions. Matching accuracy was high throughout. The ratio of left to right side-key responses increased as the ratio of left to right reinforcers increased, the ratio of red to green responses increased as the ratio of red to green reinforcers increased, and there was no interaction between these variables. However, side-key biases were more sensitive to the distribution of reinforcers across key location than were comparison-color biases to the distribution of reinforcers across key color. An extension of Davison and Tustin's (1978) model of DMTS performance fit the data well, but the results were also consistent with an alternative theory of conditional discrimination performance (Jones, 2003) that calls for a conceptually distinct quantitative model.

16.
Two probabilistic schedules of reinforcement, one richer in reinforcement, the other leaner, were overlapping stimuli to be discriminated in a choice situation. One of two schedules was in effect for 12 seconds. Then, during a 6-second choice period, the first left-key peck was reinforced if the richer schedule had been in effect, and the first right-key peck was reinforced if the leaner schedule had been in effect. The two schedule stimuli may be viewed as two binomial distributions of the number of reinforcement opportunities. Each schedule yielded different frequencies of 16 substimuli. Each substimulus had a particular type of outcome pattern for the 12 seconds during which a schedule was in effect, and consisted of four consecutive light-cued 3-second T-cycles, each having 0 or 1 reinforced center-key pecks. Substimuli therefore contained 0 to 4 reinforcers. On any 3-second cycle, the first center-key peck darkened that key and was reinforced with probability .75 or .25 in the richer or leaner schedules, respectively. In terms of the theory of signal detection, detectability neared the maximum possible d′ for all four pigeons. Left-key peck probability increased when number of reinforcers in a substimulus increased, when these occurred closer to choice, or when pellets were larger for correct left-key pecks than for correct right-key pecks. Averaged over different temporal patterns of reinforcement in a substimulus, substimuli with the same number of reinforcers produced choice probabilities that matched relative expected payoff rather than maximized one alternative.
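The binomial structure described in this abstract (four cycles, reinforcement probability .75 or .25 per cycle) can be worked through numerically. The sketch below computes the distribution of reinforcer counts under each schedule and the hit and false-alarm rates of an equal-prior observer that responds "richer" whenever the count is more likely under the richer schedule, splitting ties evenly. This observer and the resulting d′ are assumptions for illustration; the abstract does not say how the maximum possible d′ was defined.

```python
from math import comb
from statistics import NormalDist


def reinforcer_count_pmf(p, n_cycles=4):
    """P(k reinforcers in n_cycles) when each cycle pays off with prob p."""
    return [comb(n_cycles, k) * p ** k * (1 - p) ** (n_cycles - k)
            for k in range(n_cycles + 1)]


def ideal_observer_rates(p_rich=0.75, p_lean=0.25, n_cycles=4):
    """Hit and false-alarm rates for a likelihood-based 'richer' response."""
    rich = reinforcer_count_pmf(p_rich, n_cycles)
    lean = reinforcer_count_pmf(p_lean, n_cycles)
    hit = 0.0
    false_alarm = 0.0
    for pr, pl in zip(rich, lean):
        if pr > pl:
            share = 1.0   # always respond "richer"
        elif pr == pl:
            share = 0.5   # ambiguous count: guess
        else:
            share = 0.0   # always respond "leaner"
        hit += share * pr
        false_alarm += share * pl
    return hit, false_alarm


def d_prime(hit, false_alarm):
    """Equal-variance Gaussian d' = z(hit) - z(false alarm)."""
    z = NormalDist().inv_cdf
    return z(hit) - z(false_alarm)


if __name__ == "__main__":
    h, f = ideal_observer_rates()
    print(h, f, round(d_prime(h, f), 2))
```

Because the four cycles are independent and identically distributed within a schedule, the likelihood of any of the 16 outcome patterns depends only on its reinforcer count, which is why the sketch works with counts rather than full patterns.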

17.
Human subjects were exposed to a concurrent-chains schedule in which reinforcer amounts, delays, or both were varied in the terminal links, and consummatory responses were required to receive points that were later exchangeable for money. Two independent variable-interval 30-s schedules were in effect during the initial links, and delay periods were defined by fixed-time schedules. In Experiment 1, subjects were exposed to three different pairs of reinforcer amounts and delays, and sensitivity to reinforcer amount and delay was determined based on the generalized matching law. The relative responding (choice) of most subjects was more sensitive to reinforcer amount than to reinforcer delay. In Experiment 2, subjects chose between immediate smaller reinforcers and delayed larger reinforcers in five conditions with and without timeout periods that followed a shorter delay, in which reinforcer amounts and delays were combined to make different predictions based on local reinforcement density (i.e., points per delay) or overall reinforcement density (i.e., points per total time). In most conditions, subjects' choices were qualitatively in accord with the predictions from the overall reinforcement density calculated by the ratio of reinforcer amount and total time. Therefore, the overall reinforcement density appears to influence the preference of humans in the present self-control choice situation.
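The contrast between local and overall reinforcement density can be made concrete with a small calculation. The sketch below follows the definitions given in the abstract (points per delay versus points per total time). The point amounts, delays, and timeout duration are hypothetical values chosen only to show how the two measures can predict opposite choices, and the function names are introduced here.

```python
def local_density(points, delay_s):
    """Points per second of delay (delay_s must be greater than zero)."""
    return points / delay_s


def overall_density(points, delay_s, timeout_s=0.0):
    """Points per second of total time, counting any timeout after the delay."""
    return points / (delay_s + timeout_s)


def predicted_choice(smaller, larger, density):
    """Return which option has the higher density under the given measure.

    smaller and larger are dicts with keys 'points', 'delay_s' and,
    optionally, 'timeout_s' (used only by overall_density).
    """
    def score(option):
        if density is overall_density:
            return density(option["points"], option["delay_s"],
                           option.get("timeout_s", 0.0))
        return density(option["points"], option["delay_s"])

    return "smaller" if score(smaller) > score(larger) else "larger"


if __name__ == "__main__":
    # Hypothetical condition: a timeout follows the shorter delay.
    smaller = {"points": 10, "delay_s": 5.0, "timeout_s": 40.0}
    larger = {"points": 30, "delay_s": 30.0}
    print(predicted_choice(smaller, larger, local_density))
    print(predicted_choice(smaller, larger, overall_density))
```

In this made-up condition the local measure favors the smaller, sooner option while the overall measure favors the larger, later one, which is the kind of divergence such conditions are designed to produce.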

18.
Token reinforcement, choice, and self-control in pigeons.   Total citations: 9, self-citations: 9, citations by others: 0
Pigeons were exposed to self-control procedures that involved illumination of light-emitting diodes (LEDs) as a form of token reinforcement. In a discrete-trials arrangement, subjects chose between one and three LEDs; each LED was exchangeable for 2-s access to food during distinct posttrial exchange periods. In Experiment 1, subjects generally preferred the immediate presentation of a single LED over the delayed presentation of three LEDs, but differences in the delay to the exchange period between the two options prevented a clear assessment of the relative influence of LED delay and exchange-period delay as determinants of choice. In Experiment 2, in which delays to the exchange period from either alternative were equal in most conditions, all subjects preferred the delayed three LEDs more often than in Experiment 1. In Experiment 3, subjects preferred the option that resulted in a greater amount of food more often if the choices also produced LEDs than if they did not. In Experiment 4, preference for the delayed three LEDs was obtained when delays to the exchange period were equal, but reversed in favor of an immediate single LED when the latter choice also resulted in quicker access to exchange periods. The overall pattern of results suggests that (a) delay to the exchange period is a more critical determinant of choice than is delay to token presentation; (b) tokens may function as conditioned reinforcers, although their discriminative properties may be responsible for the self-control that occurs under token reinforcer arrangements; and (c) previously reported differences in the self-control choices of humans and pigeons may have resulted at least in part from the procedural conventions of using token reinforcers with human subjects and food reinforcers with pigeon subjects.

19.
In a discrete-trial procedure, pigeons could choose between 2-s and 6-s access to grain by making a single key peck. In Phase 1, the pigeons obtained both reinforcers by responding on fixed-ratio schedules. In Phase 2, they received both reinforcers after simple delays, arranged by fixed-time schedules, during which no responses were required. In Phase 3, the 2-s reinforcer was available through a fixed-time schedule and the 6-s reinforcer was available through a fixed-ratio schedule. In all conditions, the size of the delay or ratio leading to the 6-s reinforcer was systematically increased or decreased several times each session, permitting estimation of an "indifference point," the schedule size at which a subject chose each alternative equally often. By varying the size of the schedule for the 2-s reinforcer across conditions, several such indifference points were obtained from both fixed-time conditions and fixed-ratio conditions. The resulting "indifference curves" from fixed-time conditions and from fixed-ratio conditions were similar in shape, and they suggested that a hyperbolic equation describes the relation between ratio size and reinforcement value as well as the relation between reinforcer delay and its reinforcement value. The results from Phase 3 showed that subjects chose fixed-time schedules over fixed-ratio schedules that generated the same average times between a choice response and reinforcement.

20.
Effects of alternative reinforcement sources: A reevaluation   Total citations: 3, self-citations: 3, citations by others: 0
The effects of two alternative sources of food delivery on the key-peck responding of pigeons were examined. Pecking was maintained by a variable-interval 3-min schedule. In the presence of this schedule in different conditions, either a variable-time 3-min schedule delivering food independently of responding or an equivalent schedule that required a minimum 2-s pause between a key peck and food delivery (a differential-reinforcement-of-other-behavior schedule) was added. The differential-reinforcement-of-other-behavior schedule reduced response rates more than did the variable-time schedule in most instances. The delay between a key peck and the next reinforcer consistently was longer under the differential-reinforcement-of-other-behavior schedule than under the variable-time schedule. Response rates and median delay between responses and reinforcers were negatively correlated. These results contradict earlier conclusions about the behavioral effects of alternative reinforcement. They suggest that an interpretation in terms of response–reinforcer contiguity is consistent with the data.
