Abstract: | Eighteen pigeons served in a discrete-trials short-term memory experiment in which the reinforcement probability for a peck on one of two keys depended on the response reinforced on the previous trial: either the probability of reinforcement on a trial was 0.8 for the same response reinforced on the previous trial and 0.2 for the other response (Group A), or it was 0 or 0.2 for the same response and 1.0 or 0.8 for the other response (Group B). A correction procedure ensured that over all trials reinforcement was distributed equally across the left and right keys. The optimal strategy was either a win-stay, lose-shift strategy (Group A) or a win-shift, lose-stay strategy (Group B). The retention interval, that is, the intertrial interval, was varied. The average probability of choosing the optimal alternative reinforced 80% of the time was 0.96, 0.84, and 0.74 after delays of 2.5, 4.0, and 6.0 sec, respectively, for Group A, and was 0.87, 0.81, and 0.55 after delays of 2.5, 4.0, and 6.0 sec, respectively, for Group B. This outcome is consistent with the view that behavior approximated the optimal response strategy but only to an extent permitted by a subject's short-term memory for the cue correlated with reinforcement, that is, its own most recently reinforced response. More generally, this result is consistent with “molecular” analyses of operant behavior, but is inconsistent with traditional “molar” analyses holding that fundamental controlling relations may be discovered by routinely averaging over different local reinforcement contingencies. In the present experiment, the molar results were byproducts of local reinforcement contingencies involving an organism's own recent behavior. |