Reinforcement

 

Learning Outcomes

1. What is Thorndike’s Law of Effect?

2. What is operant behaviour? What is the difference between reinforcement and a reinforcer?

3. Describe positive and negative reinforcement. Explain the difference between escape and avoidance behaviour.

4. What are the differences between social vs. automatic, natural vs. programmed, and tangible vs. sensory vs. activity reinforcement?

5. How does the Premack principle work?

6. What is the difference between unconditioned and conditioned reinforcers?

7. How do reward value, motivating operations, timing, and contingency affect reinforcement?

8. Explain the difference between continuous and intermittent reinforcement.

9. What are the characteristics of the four schedules of intermittent reinforcement?

 


 

Research Focus

 

How can ___________ behaviour be increased by reinforcement?

(Wahler et al., 1965)

- Danny, a 6-year-old boy, was uncooperative with his parents

- __________ behaviours: decided for himself what foods to eat, when his parents would play with him, and when he would go to bed

- responded to punishment by shouting and crying

- behavioural treatment program:

• sessions conducted in observation room

• commanding behaviours (e.g., “Now we’ll play this”) ignored by mother

• ___________ behaviours (e.g., “Do you want to sit here?”) supported and responded to by mother

- results: cooperative behaviours increased (also, commanding behaviours decreased)

Wahler et al. (1965)

 


 

Reinforcement

 

E. L. Thorndike (1911):

- placed hungry cat into an escapable “______ ___,” with a plate of fish outside the box

- cat could eventually open the box, using trial and error

- behaviours became quicker over time

- ___ __ ______: behaviour followed by pleasant consequences is more likely to occur again in that situation

 

Operant Behaviour

 

- definition of _______: “functioning or tending to produce effects: effective; of or relating to the observable or measurable” (Merriam-Webster, 2013)

- that is, a behaviour that operates on the environment (Skinner, 1937)

- is “_______” or “evoked” (not “elicited” by a stimulus, as in respondent conditioning)

- operant (or ____________) conditioning entails manipulating consequences of behaviour

• the consequences may increase or decrease a behaviour

• the consequence of a behaviour can itself be a stimulus or event that leads to further behaviour

e.g., eat salty snack → get thirsty → drink sugary pop

• consequences occur immediately after a behaviour

 

_____________: the process in which the consequence of a behaviour strengthens the behaviour

- behaviour is more likely to occur in the future (frequency), or occurs more quickly (latency), etc.

- __________: a stimulus, object, or event that strengthens a behaviour; often is an appetitive stimulus (characterized by a natural desire to satisfy bodily needs)

e.g., after a dog follows your command, you give it a treat: reinforcement is giving the treat to increase the behaviour; the reinforcer is the treat itself

 

Types of reinforcement:

________ reinforcement: a situation in which a behaviour is followed by the presentation of an appetitive (pleasant) stimulus that increases the behaviour

(Note: textbook calls an appetitive stimulus a “positive reinforcer” or simply “reinforcer.”)

e.g., I tell a joke → you _____; this makes me more likely to tell more jokes in the future

 

________ reinforcement: a situation in which a behaviour is followed by the removal of an aversive (unpleasant) stimulus that increases the behaviour

e.g., putting up an umbrella → stops cold rain falling on you; this makes you more likely to use an umbrella in the future when it’s raining

 

Subtypes of negative reinforcement:

______ behaviour: causes removal of existing aversive stimulus

e.g., when you feel cold, you put on a sweater

_________ behaviour: prevents presentation of aversive stimulus

e.g., before you go outside, you put on a sweater

Escape and avoidance are seldom used therapeutically, because they require exposing the client to an aversive situation.

 


 

Forms of Reinforcement

 

_______ reinforcement: occurs spontaneously as part of everyday life

e.g., your friend laughs when you tell a joke

 

__________ reinforcement: planned and systematic; given as part of a behavioural treatment

e.g., giving yourself rewards as part of a self-management program

 

______ reinforcement: involves another person to deliver reinforcing consequences

e.g., teacher praises a student for completing her homework

 

_________ reinforcement: the individual gets reinforcing consequences directly from the environment, independent of the actions of other people

e.g., you scratch an itch to make it go away

 

________ (or material) reinforcement: access to a preferred object (includes consumable reinforcement)

e.g., getting toys, stickers, or snacks after good behaviour

 

sensory (or interoceptive) reinforcement: pleasant sensory stimulation

e.g., listening to music, or tactile stimulation

 

________ reinforcement: engaging in a preferred behaviour after doing a non-preferred behaviour

e.g., the pomodoro technique: set a timer for 25 minutes and do work, then take a 5-minute break and do something fun; repeat (Cirillo, 2018)

 

_______ principle (Premack, 1959): a high-probability behaviour can serve as positive reinforcement for performing a low-probability behaviour, thus increasing it

e.g., after studying for an hour, you play video games for an hour, making it more likely you’ll study again in the future

 

How can you apply behavioural principles to increase exercise?

(Milkman et al., 2014)

- participants: 226 students, faculty, and staff with a university gym membership, who indicated that they wanted to work out more

- independent variable:

• full treatment condition: chose 4 audiobooks from a set of 82 best-sellers (including The Hunger Games trilogy, Da Vinci Code novels, Twilight series), loaded on an ____ _______ accessed only at the gym

• control condition: received $25 Barnes & Noble bookstore gift cards, and were encouraged to work out more

- results:

• overall, gym attendance ________ once fall term began (0.00 is baseline level):

Milkman et al. (2014)

• statistically significant difference in average number of gym visits between full treatment and control conditions

• (precipitous ____-___ in gym attendance after Thanksgiving break when gym was closed)

- participants asked how much they would pay for an audiobook available only at the gym:

• 61% would pay $1 or more

• 32% would pay $10 or more

• 10% would pay $20 or more

- authors’ suggestion: “_______” for movies available to you only at the gym

- “__________ ________”: making a more desirable behaviour (e.g., listening to a favourite podcast, getting a pedicure) contingent on performing a less desirable behaviour (e.g., washing dishes, spending time with a difficult relative)

(strictly speaking, temptation bundling is not activity reinforcement, but synchronous reinforcement; e.g., see Diaz de Villegas et al., 2020)

 


 

Kinds of Reinforcers

 

_____________ (or primary) reinforcer: stimulus or event that has natural reinforcing effects (i.e., not due to prior conditioning or learning); may enhance survival

e.g., food, water, absence of pain

 

___________ (or secondary) reinforcer: previously neutral stimulus that has become associated with an unconditioned reinforcer

e.g., money is a generalized conditioned reinforcer: it can be exchanged for almost any other reinforcer, including primary reinforcers

e.g., in _____ reinforcement, tokens can be exchanged for backup reinforcers (like money, food, or TV time)

e.g., animal clicker training: a click sound is paired with food reinforcement; eventually the sound becomes reinforcing on its own (Pryor, 1984)

 

e.g., clicker training has also been used successfully with gymnasts, golfers, veterinarians, dancers, football linemen, and pole-vaulters

Levy et al. (2016):

- participants: ________ residents

- tasks: tying a locking, sliding knot, and making a low-angle drill hole


- independent variable: teaching strategy

1) traditional demonstration approach

2) tasks were taught as behavioural chains; positive reinforcement given via conditioned reinforcer (_______)

- clicker group outperformed traditional group on both tasks

 


 

Factors Influencing Effectiveness of Reinforcement

 

• ______ value: quantity and quality of the reinforcer, and its value to the individual

• __________ operations (MOs): antecedent events that can (temporarily) alter the effectiveness of reinforcement, and thus affect behaviour; also called setting events

- types of setting events (Kazdin, 2000):

• social: e.g., presence of an attractive person

• physiological: e.g., having a headache

• environmental: e.g., quiet library for studying

 

- ____________ operation (EO): establishes/increases the effectiveness of reinforcement

e.g., caloric deprivation is EO for food

 

- __________ operation (AO): decreases the effectiveness of reinforcement

e.g., satiation (fullness) is AO for food

 

• timing: reinforcement should occur soon after the behaviour

• ___________: consequences should consistently follow the behaviour

 


 

Schedules of Reinforcement

(Ferster & Skinner, 1957)

- pattern of occurrence of reinforcement following behaviour

 

__________: reinforcement given for each response

- leads to rapid acquisition (performing a new behaviour)

e.g., putting money into a vending machine → getting candy every time

 

____________ (or partial): only some responses are reinforced

- acquisition phase is longer

e.g., asking random strangers out on a date → someone accepts only occasionally

 

_____ _____ (FR): reinforcer given after a set number of responses

- FR 10: every 10 bar-presses → 1 food pellet

- high response rate; brief post-reinforcement pause

e.g., salesperson gets a bonus every time they sell 10 cars

 

________ _____ (VR): reinforcer given after a variable number of responses (the number varies around a mean)

- VR 20: on average, every 20 bar-presses → 1 food pellet

- high response rates

e.g., slot machines
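
The two ratio schedules above can be sketched as a small Python simulation. This is an illustration only, not a model from the cited sources: the function names (`fr_schedule`, `vr_schedule`, `count_reinforcers`) are hypothetical, and drawing each VR requirement uniformly from 1 to 2 × mean − 1 is an assumption chosen only so that the requirement averages out to the stated mean.

```python
import random


def fr_schedule(n):
    """FR n: reinforce every n-th response (hypothetical helper)."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0          # start counting toward the next reinforcer
            return True        # reinforcer delivered
        return False

    return respond


def vr_schedule(mean, rng):
    """VR mean: reinforce after a varying number of responses.

    Assumption: each requirement is drawn uniformly from
    1..(2 * mean - 1), so the requirement averages `mean`.
    """
    def draw():
        return rng.randint(1, 2 * mean - 1)

    count, required = 0, draw()

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count, required = 0, draw()   # new, unpredictable requirement
            return True
        return False

    return respond


def count_reinforcers(respond, responses):
    """Emit `responses` responses and count how many are reinforced."""
    return sum(respond() for _ in range(responses))


if __name__ == "__main__":
    rng = random.Random(0)
    print("FR 10:", count_reinforcers(fr_schedule(10), 1000))
    print("VR 20:", count_reinforcers(vr_schedule(20, rng), 1000))
```

Over 1000 responses, FR 10 delivers exactly 100 reinforcers. VR 20 delivers roughly 50, and the learner cannot tell in advance which response will pay off.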

 

_____ ________ (FI): reinforcer given when response occurs after a certain length of time

- FI 5": first response after each 5-second interval has elapsed → 1 food pellet

- responses increase as reinforcement time nears

e.g., mailbox checking increases as the typical delivery time approaches

 

________ ________ (VI): reinforcer given when response occurs after a variable length of time (length deviates around a mean)

- VI 30": first response after an average of 30 seconds since the last reinforcement → 1 food pellet

- slow, steady responding

e.g., paddling out to surf
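
The two interval schedules above can be sketched the same way in Python, using explicit response times in seconds rather than a real clock. This is an illustration only: the function names (`fi_schedule`, `vi_schedule`, `reinforced_times`) are hypothetical, timing each interval from the last reinforcer follows the VI 30" description above, and drawing each VI wait uniformly from 0 to 2 × mean is an assumption chosen only so that the wait averages out to the stated mean.

```python
import random


def fi_schedule(interval):
    """FI: reinforce the first response made at least `interval`
    seconds after the last reinforcer (hypothetical helper)."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t           # the next interval starts now
            return True
        return False           # responding early earns nothing

    return respond


def vi_schedule(mean, rng):
    """VI: like FI, but the required wait changes after each reinforcer.

    Assumption: each wait is drawn uniformly from [0, 2 * mean],
    so the wait averages `mean` seconds.
    """
    last, wait = 0.0, rng.uniform(0, 2 * mean)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last, wait = t, rng.uniform(0, 2 * mean)
            return True
        return False

    return respond


def reinforced_times(respond, response_times):
    """Return the response times at which a reinforcer was delivered."""
    return [t for t in response_times if respond(t)]


if __name__ == "__main__":
    # One response per second for 20 seconds on FI 5".
    print(reinforced_times(fi_schedule(5), range(1, 21)))
    # One response per second for 5 minutes on VI 30".
    rng = random.Random(0)
    print(len(reinforced_times(vi_schedule(30, rng), range(1, 301))))
```

On FI 5" with one response per second, only the responses at 5, 10, 15, and 20 seconds are reinforced; the other 16 responses earn nothing, which is consistent with responding concentrating near the end of each interval. On VI the payoff time is unpredictable, so steady responding is the only way to collect reinforcers promptly.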