Abstract: | Pigeons were trained on simultaneous red-green discrimination procedures with delayed reward and sequences of stimuli during the delay. In Experiment 1, three stimuli appeared during the 60-second intervals between the correct responses and reward, and the incorrect responses and nonreward. The stimulus that immediately followed a correct response also preceded nonreward, and the stimulus that followed an incorrect response preceded reward. These stimuli lasted either 10 seconds or 0.33 second, depending on the group. Stimuli during the remainder of the delay interval differed following correct and incorrect responses. Group 10 initially persisted in the nonrewarded choice, but shifted to a preponderance of rewarded responses after further training. Group .33 rapidly acquired the correct response. Similar results were obtained in Experiment 2, where delay intervals consisted of opposite sequences of two stimuli of equal duration and total delays were 6, 20, or 60 seconds. Early in training, generalization of differential conditioned-reinforcing properties from the conditions preceding reward and nonreward to postchoice conditions had a greater effect relative to backchaining than it did later. It was concluded that delayed-reward learning is best analyzed in terms of the conditioned-reinforcing value of the patterns of cues that follow immediately after rewarded and nonrewarded responses. |