Similar Articles
1.
Hill-climbing by pigeons
Pigeons were exposed to two types of concurrent operant-reinforcement schedules in order to determine what choice rules determine behavior on these schedules. In the first set of experiments, concurrent variable-interval, variable-interval schedules, key-peck responses to either of two alternative schedules produced food reinforcement after a random time interval. The frequency of food-reinforcement availability for the two schedules was varied over different ranges for different birds. In the second series of experiments, concurrent variable-ratio, variable-interval schedules, key-peck responses to one schedule produced food reinforcement after a random time interval, whereas food reinforcement occurred for an alternative schedule only after a random number of responses. Results from both experiments showed that pigeons consistently follow a behavioral strategy in which the alternative schedule chosen at any time is the one which offers the highest momentary reinforcement probability (momentary maximizing). The quality of momentary maximizing was somewhat higher and more consistent when both alternative reinforcement schedules were time-based than when one schedule was time-based and the alternative response-count based. Previous attempts to provide evidence for the existence of momentary maximizing were shown to be based upon faulty assumptions about the behavior implied by momentary maximizing and resultant inappropriate measures of behavior.

2.
The correlation between short-term retention of the outcome of the preceding response and overall learning proficiency was investigated for serial reversal learning. Pigeons were trained to asymptote on a serial reversal problem and then were presented a percentage reinforcement schedule where only some correct trials were rewarded. Nonrewarded correct trials were treated exactly as incorrect trials. The difference in error probability following the two types of correct trials was then used as a measure of short-term retention. When intertrial intervals (ITIs) were short (6 sec), substantial differences occurred. When the ITIs were increased, the difference in accuracy declined regularly to no difference at an ITI of 60 sec. This demonstration of a short-term retention gradient, coupled with the finding that overall reversal learning was much better with the shorter ITIs, suggests that a primary mechanism of improvement in serial reversal learning is the acquisition of a conditional discrimination based on the outcome of the preceding response.

3.
Two probabilistic schedules of reinforcement, one richer in reinforcement, the other leaner, were overlapping stimuli to be discriminated in a choice situation. One of two schedules was in effect for 12 seconds. Then, during a 6-second choice period, the first left-key peck was reinforced if the richer schedule had been in effect, and the first right-key peck was reinforced if the leaner schedule had been in effect. The two schedule stimuli may be viewed as two binomial distributions of the number of reinforcement opportunities. Each schedule yielded different frequencies of 16 substimuli. Each substimulus had a particular type of outcome pattern for the 12 seconds during which a schedule was in effect, and consisted of four consecutive light-cued 3-second T-cycles, each having 0 or 1 reinforced center-key pecks. Substimuli therefore contained 0 to 4 reinforcers. On any 3-second cycle, the first center-key peck darkened that key and was reinforced with probability .75 or .25 in the richer or leaner schedules, respectively. In terms of the theory of signal detection, detectability neared the maximum possible d′ for all four pigeons. Left-key peck probability increased when number of reinforcers in a substimulus increased, when these occurred closer to choice, or when pellets were larger for correct left-key pecks than for correct right-key pecks. Averaged over different temporal patterns of reinforcement in a substimulus, substimuli with the same number of reinforcers produced choice probabilities that matched relative expected payoff rather than maximized one alternative.
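The d′ statistic this abstract reports is the standard signal-detection measure: the difference between the z-transformed hit rate and false-alarm rate. As a minimal sketch (the function name and Python implementation are mine, not the article's):

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    # z-transform both rates with the inverse standard-normal CDF;
    # rates of exactly 0 or 1 must be corrected (e.g., by 1/(2N)) before use.
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)
```

With hit and false-alarm rates of .95 and .05, for example, `d_prime(0.95, 0.05)` gives about 3.3, a high detectability comparable to the near-maximum values described above.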

4.
Stimulus generalization and the response-reinforcement contingency
Generalization gradients along a line-tilt continuum were obtained from groups of pigeons that had been trained to peck a key on different schedules of reinforcement. In Exp. I, gradients following training on a differential-reinforcement-of-low-rate (DRL) schedule proved to be much flatter than gradients following the usual 1-min variable interval (VI) training. In Exp. II, the value of the VI schedule itself was parametrically studied; Ss trained on long VI schedules (e.g., 4-min) produced much flatter gradients than Ss trained on short VI schedules (30-sec; 1-min). The results were interpreted mainly in terms of the relative control exerted by internal, proprioceptive cues on the different reinforcement schedules. Several implications of the results for other problems in the field of stimulus generalization are discussed.

5.
A systematic sequence of prompt and probe trials was used to teach picture names to three severely retarded children. On prompt trials the experimenter presented a picture and said the picture name for the child to imitate; on probe trials the experimenter did not name the picture. A procedure whereby correct responses to prompts and probes were nondifferentially reinforced was compared with procedures whereby correct responses to prompts and probes were differentially reinforced according to separate and independent schedules of primary reinforcement. In Phase 1, correct responses to prompts and probes were reinforced nondifferentially on a fixed ratio (FR) 6 or 8 schedule; in Phase 2, correct responses to prompts were reinforced on the FR schedule and correct responses to probes were reinforced on an FR schedule of the same value; in Phase 3, correct responses to prompts were reinforced on the FR schedule and correct responses to probes were reinforced on a continuous reinforcement (CRF; every correct response reinforced) schedule; in Phase 4, correct responses to prompts were reinforced on a CRF schedule and correct responses to probes were reinforced on the FR schedule; in Phase 5, a reversal to the conditions of Phase 3 was conducted. For all three children, the FR schedule for correct responses to prompts combined with the CRF schedule for correct responses to probes (Phases 3 and 5) generated the highest number of correct responses to probes, the highest accuracy (correct responses relative to correct responses plus errors) on probe trials, and the highest rate of learning to name pictures.

6.
The effects of several different schedules of primary reinforcement were compared in a picture-naming task with retarded children. In Experiment I, number of correct responses and learning rate were higher under fixed-ratio schedules than under continuous reinforcement. In Experiment II, number of correct responses and learning rate tended to be greater under intermediate than under low or high fixed-ratio schedules. In Experiment III, number of correct responses was higher under interlocking schedules, in which the response requirement increased with time following the previous reinforcement, than under comparable fixed-ratio schedules. Learning rates were generally low and, perhaps because of this, not very different under the two types of schedules in this experiment. Accuracy (i.e., proportion of trials on which correct responses occurred) was typically high and insensitive to variations in schedule and schedule parameter throughout each experiment.

7.
Context, observing behavior, and conditioned reinforcement
Pigeons made observing responses for stimuli signalling either a fixed-interval 30-sec schedule or a fixed-ratio x schedule, where x was either 20, 30, 100, 140, or 200 and the schedules alternated at random after reinforcement. If observing responses did not occur, food-producing responses occurred to a stimulus common to both reinforcement schedules. When the fixed-interval schedule was paired with a low-value fixed ratio, i.e., 20 or 30, the presentation of the stimulus reliably signalling the fixed-ratio schedule reinforced observing behavior, but the presentation of the stimulus reliably signalling the fixed-interval schedule did not. The converse was the case when the fixed-interval schedule was paired with a large-valued fixed ratio, i.e., 100, 140, or 200. The results demonstrated that the occasional presentation of the stimulus signalling the shorter interreinforcement interval was necessary for the maintenance of observing behavior. The reinforcement relationship was a function of the schedule context and was reversed by changing the context. Taken together, the results show that the establishment and measurement of conditioned reinforcement is dependent upon the context or environment in which stimuli reliably correlated with differential events occur.

8.
Rats were trained on a discrete-trial procedure in which one alternative (VR) was correlated with a constant probability of reinforcement while the other was correlated with a VI schedule which ran during the intertrial intervals and held scheduled reinforcers until they were obtained by the next VI response. Relative reinforcement rate was varied in series of conditions in which the VR schedule was varied and in series in which the VI was varied. Choice behavior was described well by the generalized matching law, although moderate undermatching occurred for all subjects. Contrary to the predictions of molar maximizing (optimality) theories, there was no consistent bias in favor of the ratio alternative, and the sensitivity to reinforcement allocation was not systematically affected by whether the ratio or interval schedule was varied. The results were also contrary to momentary maximizing accounts, as there was no correspondence between the probability of a changeover to the VI behavior and the time since the last response to the VI alternative. Neither variety of maximizing theory appears to provide a general explanation of matching in concurrent schedules.
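The generalized matching law invoked above is usually written log(B1/B2) = a·log(R1/R2) + log c, where a is sensitivity (a < 1 is the undermatching reported here) and c is bias. A minimal sketch of fitting it by least squares in log coordinates (the function name and data are illustrative, not from the study):

```python
import math

def fit_generalized_matching(b_ratios, r_ratios):
    # Ordinary least squares fit of log(B1/B2) = a*log(R1/R2) + log(c).
    # Returns (a, c): a < 1 indicates undermatching, c != 1 indicates bias.
    xs = [math.log10(r) for r in r_ratios]
    ys = [math.log10(b) for b in b_ratios]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, 10 ** (my - a * mx)
```

For example, behavior ratios generated as `r ** 0.8` over a range of reinforcement ratios `r` recover a sensitivity of 0.8 with no bias, the qualitative pattern these rats showed.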

9.
In Experiment 1, Japanese monkeys were trained on three conditional position-discrimination problems with colors as the conditional cues. Within each session, each problem was presented for two blocks of ten reinforcements; correct responses were reinforced under continuous-reinforcement, fixed-ratio 5, and variable-ratio 5 schedules, each assigned to one of the three problems. The assignment of schedules to problems was rotated a total of three times (15 sessions per assignment) after 30 sessions of acquisition training. Accuracy of discrimination increased to a moderate level with fewer trials under CRF than under ratio schedules. In contrast, the two ratio schedules, fixed and variable, were more effective in maintaining accurate discrimination than was CRF. With further training, as asymptotes were reached, accuracy was less affected by the schedule differences. These results demonstrated an interaction between the effects of reinforcement schedules and the level of acquisition. In Experiment 2, ratio sizes were gradually increased to 30. Discrimination accuracy was maintained until the ratio reached 20; ratio 30 strained the performance. Under FR conditions, accuracy increased as correct choice responses accumulated after reinforcement.

10.
It has been suggested that the failure to maximize reinforcement on concurrent variable-interval, variable-ratio schedules may be misleading. Inasmuch as response costs are not directly measured, it is possible that subjects are optimally balancing the benefits of reinforcement against the costs of responding. To evaluate this hypothesis, pigeons were tested in a procedure in which interval and ratio schedules had equal response costs. On a concurrent variable time (VT), variable ratio-time (VRT) schedule, the VT schedule runs throughout the session and the VRT schedule is controlled by responses to a changeover key that switches from one schedule to the other. Reinforcement is presented independent of response. This schedule retains the essential features of concurrent VI VR, but eliminates differential response costs for the two alternatives. It therefore also eliminates at least one significant ambiguity about the reinforcement maximizing performance. Pigeons did not maximize rate of reinforcement on this procedure. Instead, their times spent on the alternative schedules matched the relative rates of reinforcement, even when schedule parameters were such that matching earned the lowest possible overall rate of reinforcement. It was further shown that the observed matching was not a procedural artifact arising from the constraints built into the schedule.

11.
Variable reinforcement schedules are used to arrange the availability of reinforcement following varying response ratios or intervals of time. Random reinforcement schedules are subtypes of variable reinforcement schedules that can be used to arrange the availability of reinforcement at a constant probability across number of responses or time. Generating schedule values for variable and random reinforcement schedules can be difficult. The present article describes the steps necessary to write macros in Microsoft Excel that will generate variable-ratio, variable-interval, variable-time, random-ratio, random-interval, and random-time reinforcement schedule values.
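The article itself describes Excel macros; the same schedule-value logic can be sketched in Python. This is a minimal sketch under my own assumptions (function names are mine, and the Fleshler-Hoffman progression for variable-interval values is a common convention, not necessarily the article's method):

```python
import math
import random

def variable_interval(mean_s, n):
    # Fleshler-Hoffman progression: n intervals whose mean is exactly mean_s,
    # with a roughly constant probability of reinforcer setup per unit time.
    f = lambda x: x * math.log(x) if x > 0 else 0.0
    return [mean_s * (1 + math.log(n) + f(n - k) - f(n - k + 1))
            for k in range(1, n + 1)]

def random_ratio(p, n_responses, seed=None):
    # Random-ratio: every response is reinforced with constant probability p.
    # Returns the response requirements the probability rule actually produced.
    rng = random.Random(seed)
    reqs, count = [], 0
    for _ in range(n_responses):
        count += 1
        if rng.random() < p:
            reqs.append(count)
            count = 0
    return reqs

def random_interval(mean_s, n, seed=None):
    # Random-interval/-time: exponential intervals give a constant
    # probability of reinforcer setup per unit time.
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_s) for _ in range(n)]
```

Variable-ratio and variable-time values follow the same pattern: a fixed list of requirements with the desired mean for "variable," versus a constant per-response or per-second probability for "random."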

12.
Third-grade boys classified as either cognitively impulsive or reflective were reinforced for key pressing according to a DRL (differential reinforcement of low rates) 6-sec schedule of reinforcement. Half of each group received instructions about the behavioral requirements for obtaining reinforcements. Prior to DRL training, impulsive Ss showed a low probability of key press responding at long interresponse time (IRT) intervals while reflective Ss exhibited an equal probability of terminating either short or long IRTs. During training and in the absence of instructions, impulsives exhibited a less precise temporal discrimination, characterized by a greater predominance of response bursts (0–2 sec IRTs) following reinforcements, than reflective Ss. While impulsive and reflective Ss displayed similar frequencies of collateral behavior between successively reinforced responses, impulsives engaged in the reinforced response more frequently and tended (p < .08) to obtain fewer reinforcements. Instructions served to enhance the DRL performance.

13.
Pigeons responded on concurrent variable-interval 180-sec variable-interval 36-sec schedules during Conditions 1 and 3 of Experiment 1. Condition 2 arranged variable-interval 60-sec schedules for both response alternatives. The schedule assigned to the alternative that was associated with the variable-interval 36-sec schedule in Conditions 1 and 3 operated only when the subject responded on that alternative. The proportion of time spent responding on the alternative with the conventional variable-interval 60-sec schedule increased during Condition 2, but exclusive choice of that alternative did not develop. This result is inconsistent with maximization of the overall reinforcement rate and with maximization of the momentary probability of reinforcement (momentary maximizing). Increasing time proportions were also found in Experiment 2, which arranged similar conditions, except that reinforcement was provided on a variable-time basis. The time proportions were close to the momentary maximizing prediction in Experiment 2. The results of both experiments can be explained if it is assumed that time allocation is controlled by delayed reinforcement of changeovers between alternatives.

14.
High-speed photography of key pecking revealed that the arc described by the upper bill as a pigeon closes its beak is capable of operating a Lehigh Valley pigeon key set at 8 to 14 g. Arc-produced switch closure follows initial switch closure in less than 50 msec. When birds were trained on ratio schedules, the probability of interresponse times (IRTs) shorter than 50 msec exceeded 0.30. Interval-trained birds produced a much lower probability of short IRTs. When the schedules were reversed, there was only weak evidence of a reversal in the probability of short IRTs. A temporal analysis of topographic features observed in the original photographs failed to reveal differences between ratio and interval pecking topography. It appeared that only the point of contact with the key differed between subjects trained on the two schedules. It was concluded that only the locus, but not the topography, of the food-reinforced key peck was modified by the schedule of reinforcement.

15.
Of 23 pigeons, 11 received continuous reinforcement for key pecking, and 12 received an FR 10 schedule of reinforcement. The birds were then tested without food, but with potential conditioned reinforcers presented either on the same schedule as in training, on the other schedule, or not at all. Each bird in the subgroup trained on CRF and tested with Sr's at FR 10 not only gave more responses in testing than did each bird in both subgroups receiving no Sr's, but also gave more responses than did each bird in the Sr subgroup receiving CRF training and Sr's at CRF. Cumulative records are presented to show the effects of different schedules of conditioned reinforcers.

16.
This study assessed the effectiveness of group interpersonal skills training conducted in a natural setting with nonanalogue clients. Subjects (Ss) in a behavioral-training condition received 4 hr of instruction consisting of modeling, behavioral rehearsal, coaching, feedback and reinforcement. Training focused on positive and negative social responses and on initiating interactions, as well as reacting to interactions initiated by others. Subjects in a discussion-control condition engaged in focused discussion of interpersonal concerns but received no experiential practice. Within a pretest-posttest control-group design, subjective and objective measures were used to assess training effects. When compared to Ss involved in group discussion, Ss participating in group behavioral training revealed greater pre- to post-test changes on self-reported probability of engaging in selected interpersonal responses and on objective measures of eye contact, speech duration, positive affective responses, use of no-statements, compliance, refusals and requests for new behavior. Support for generalization of training is presented and methodological issues are discussed.

17.
Matching theory describes a process by which organisms distribute their behavior between two or more concurrent schedules of reinforcement (Herrnstein, 1961). In an attempt to determine the generality of matching theory to applied settings, 2 students receiving special education were provided with academic response alternatives. Using a combined simultaneous treatments design and reversal design, unequal ratio schedules of reinforcement were varied across two academic responses. Findings indicated that both subjects allocated higher rates of responses to the richer schedule of reinforcement, although only one responded exclusively to the richer schedule. The present results lend support to a postulation that positive reinforcement may have undesirable collateral effects that are predicted by matching theory (Balsam & Bondy, 1983).

18.
Three groups of rats pressed a lever for milk reinforcers on various simple reinforcement schedules (one schedule per condition). In Group M, each pair of conditions included a mixed-ratio schedule and a fixed-ratio schedule with equal average response:reinforcer ratios. On mixed-ratio schedules, reinforcement occurred with equal probability after a small or a large response requirement was met. In Group R, fixed-ratio and random-ratio schedules were compared in each pair of conditions. For all subjects in these two groups, the frequency distributions of interresponse times of less than one second were very similar on all ratio schedules, exhibiting a peak at about .2 seconds. For comparison, subjects in Group V responded on variable-interval schedules, and few interresponse times as short as .2 seconds were recorded. The results suggest that the rate of continuous responding is the same on all ratio schedules, and what varies among ratio schedules is the frequency, location, and duration of pauses. Preratio pauses were longer on fixed-ratio schedules than on mixed-ratio or random-ratio schedules, but there was more within-ratio pausing on mixed-ratio and random-ratio schedules. Across a single trial, the probability of an interruption in responding decreased on fixed-ratio schedules, was roughly constant on random-ratio schedules, and often increased and then decreased on mixed-ratio schedules. These response patterns provided partial support for Mazur's (1982) theory that the probability of instrumental responding is directly related to the probability of reinforcement and the proximity of reinforcement.

19.
Reinforcement of least-frequent sequences of choices
When a pigeon's choices between two keys are probabilistically reinforced, as in discrete trial probability learning procedures and in concurrent variable-interval schedules, the bird tends to maximize, or to choose the alternative with the higher probability of reinforcement. In concurrent variable-interval schedules, steady-state matching, which is an approximate equality between the relative frequency of a response and the relative frequency of reinforcement of that response, has previously been obtained only as a consequence of maximizing. In the present experiment, maximizing was impossible. A choice of one of two keys was reinforced only if it formed, together with the three preceding choices, the sequence of four successive choices that had occurred least often. This sequence was determined by a Bernoulli-trials process with parameter p. Each of three pigeons matched when p was ½ or ¼. Therefore, steady-state matching by individual birds is not always a consequence of maximizing. Choice probability varied between successive reinforcements, and sequential statistics revealed dependencies which were adequately described by a Bernoulli-trials process with p depending on the time since the preceding reinforcement.
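The contingency described above can be sketched as follows. This is a minimal sketch under my own reading, assuming "occurred least often" means tying the minimum count over all 16 possible four-choice sequences; the abstract does not spell out tie handling, and the names are illustrative:

```python
from collections import Counter
from itertools import product

def reinforced(history, choice):
    # Count every completed run of four successive choices so far.
    counts = Counter(tuple(history[i:i + 4]) for i in range(len(history) - 3))
    # The current choice is reinforced only if the four-choice sequence it
    # completes ties the minimum count over all 16 possible sequences.
    min_count = min(counts.get(s, 0) for s in product('LR', repeat=4))
    completed = tuple(history[-3:]) + (choice,)
    return counts.get(completed, 0) == min_count
```

Under this rule no fixed preference can maximize: any sequence the bird repeats soon stops being the least frequent, which is what made maximizing impossible in the experiment.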

20.
The effects of elaboration structure (Sentence, Semantic paragraph, Syntactic paragraph) and list length (8, 12, 16 pairs) on paired-associate learning were investigated in a 3 × 3 factorial design. Seventy-five educable retardates were tested on acquisition (S-R) and reversal (R-S) tasks. Significant acquisition differences were found in the 8-pair list, where Semantic paragraph Ss performed better than Sentence Ss. In the longer lists, all structures were equally effective in facilitating acquisition (mean first trial correct = 60%), as well as reversal (mean correct = 95%). Sentence form (declarative, imperative, interrogative) was controlled in each elaboration structure. Analyses indicated that significantly fewer acquisition errors were made on pairs presented in declarative and imperative, as opposed to interrogative, elaborations. Tests of recall for the elaborations revealed that Ss in all conditions generally recalled them as declarative sentences. Further observations at 24 pairs confirmed the 12- and 16-pair findings.
