[MUD-Dev] Conditioned reinforcers: Getting players to do things for free

John Hopson jwh9 at acpub.duke.edu
Fri Sep 15 12:12:30 CEST 2000


	This is the next installment in my ongoing series of articles about
applying experimental psychology to game design.

Previous articles:

	http://www.kanga.nu/archives/MUD-Dev-L/2000Q3/msg00364.php
	http://www.kanga.nu/archives/MUD-Dev-L/2000Q3/msg00725.php


If you have any ideas for topics you'd like me to cover, drop me an email.
If you're finding these articles useful or have any questions, I'd be happy
to hear from you.



-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


	One of the problems with using food rewards in animal experiments is that
the subjects get full.  Sooner or later, the rat has eaten 900 little food
pellets and decides he's going to sleep.  One solution psychologists have
used to keep their subjects working is conditioned reinforcers: virtual
rewards that derive their value from the normal reinforcement.

	The essence of this idea is that stimuli associated with reinforcers
eventually become reinforcers themselves.  If you show the rat a green
light every time he gets a pellet, eventually he will work just to see the
green light.  Of course, he still prefers the actual pellets, but the light
is so strongly associated with reward that it becomes rewarding in itself.
A neutral stimulus becomes linked in the mind to something of intrinsic
value, giving it a derived value.

	As a solution to the psychologists' problem, conditioned reinforcers are a
godsend.  Suppose that when the rat meets the conditions for a reinforcer,
half the time it receives a food reinforcer and a conditioned reinforcer,
and the other half of the time it receives only a conditioned reinforcer.
The conditioned reinforcer tells the animal "yes, you did the right thing"
without giving it an actual reward.  If we simply provided nothing, the
subject would not know whether it had made a mistake or whether another
pellet was ever coming.  The conditioned reinforcer provides encouragement
without real reward.  Satiation ("being full") is a function of real
reward, so adding conditioned reinforcers stretches the amount of work the
subject is willing to do.
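
As a rough illustration of the arithmetic, here is a minimal Python sketch
of that schedule.  The names (run_session, food_probability,
satiation_limit) are made up for this example, and it assumes the subject
keeps responding at full strength until it is full, which ignores any
weakening of the conditioned reinforcer over time.

```python
import random

def run_session(food_probability, satiation_limit, max_trials=10_000, seed=0):
    """Count correct responses made before the subject is full.

    Assumption: every correct response earns the conditioned reinforcer
    (say, the green light), but only a fraction `food_probability` of them
    also earn a food pellet. Only pellets count toward satiation.
    """
    rng = random.Random(seed)
    pellets_eaten = 0
    responses = 0
    while pellets_eaten < satiation_limit and responses < max_trials:
        responses += 1
        # The light would be shown here on every response; it costs nothing.
        if rng.random() < food_probability:
            pellets_eaten += 1
    return responses

always_food = run_session(food_probability=1.0, satiation_limit=900)
half_food = run_session(food_probability=0.5, satiation_limit=900)
print(always_food, half_food)
```

With food on every response the session ends after exactly 900 responses;
with food on half of them it runs for roughly twice as long before the
same 900 pellets are eaten.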

	Now, you might think humans are too smart for this, but there are a
thousand examples in our daily life.  Think of the person quitting
cigarettes who chews on a pen, deriving comfort from the act even in the
absence of nicotine.  Or a corporation providing titles in lieu of raises.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Conditioned reinforcers, like most of our fantasy magics, have rules that
govern their use.


**Rule #1**  The conditioned reinforcer has to occur every time the real
reward occurs.

You're forging a link between a neutral thing and the reward.  If the
reward occurs on its own, you've made the two things distinct items that
just happen together sometimes.



**Rule #2**  The more the conditioned reinforcer occurs without the reward,
the weaker it gets.

The connection can be diluted from the other end too.  Of course,
conditioned reinforcers have to happen without the reward in order to be of
any use, but the more you do it the weaker they get.
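
One way to check a design against these two rules is to log reward events
and measure how tightly the cue and the reward are paired.  The sketch
below is only an illustration: pairing_stats and the (cue_shown,
reward_given) log format are hypothetical, not part of any particular mud
codebase.

```python
def pairing_stats(events):
    """Summarize how well a cue is paired with a real reward.

    `events` is a list of (cue_shown, reward_given) booleans.
    Returns (cue_given_reward, reward_given_cue):
      - cue_given_reward should be 1.0 to satisfy Rule #1
      - reward_given_cue falls as the cue is used alone (Rule #2)
    Either value is None if there is nothing to measure.
    """
    cue_when_rewarded = [cue for cue, reward in events if reward]
    reward_when_cued = [reward for cue, reward in events if cue]
    cue_given_reward = (
        sum(cue_when_rewarded) / len(cue_when_rewarded) if cue_when_rewarded else None
    )
    reward_given_cue = (
        sum(reward_when_cued) / len(reward_when_cued) if reward_when_cued else None
    )
    return cue_given_reward, reward_given_cue

# Cue on every reward, plus one cue with no reward behind it.
log = [(True, True), (True, False)]
print(pairing_stats(log))
```

A first value below 1.0 means the reward sometimes arrives without the
cue, which breaks Rule #1.  The second value is the dial Rule #2 talks
about: the lower it goes, the more work you get per real reward, and the
weaker the cue becomes.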


-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Some examples of conditioned reinforcers in muds:

Titles:  Knight, Dread Lord, Supreme Ninja.  It's easy to create titles that
players can earn as they advance in power.  As long as gains in power also
include gains in title, titles can serve as a reward on their own.

Distinctive equipment:  Because everyone starts at low levels, the low
level equipment is common.  As characters advance, they can gain other
equipment.  The harder the equipment is to get, the rarer it will be.  As a
result, unusual equipment becomes valuable in and of itself.

Killing: If you're playing a game in which advancement occurs by killing
things, then killing things becomes a conditioned reinforcer.  This may
explain why players will slaughter hundreds of weak things that they know
won't contribute to their leveling; the death of a mob has become a reward
because rewards always happen after the death of a mob.
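
For the titles example, the simplest way to respect Rule #1 is to make it
impossible to gain power without the title changing too.  The following is
a minimal sketch under made-up assumptions: the Character class, the
one-title-per-level ladder, the hit point numbers, and the "Squire" title
are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical ladder with one title per level, so every gain in power
# comes with a new title. "Squire" is invented for this example.
TITLE_LADDER = ["Squire", "Knight", "Dread Lord", "Supreme Ninja"]

@dataclass
class Character:
    name: str
    level: int = 1
    max_hp: int = 20

    @property
    def title(self) -> str:
        # The title is derived from the level, so the two cannot drift apart.
        return TITLE_LADDER[self.level - 1]

def level_up(character: Character, hp_gain: int = 10) -> str:
    """Apply the real reward and the title change in one step."""
    if character.level >= len(TITLE_LADDER):
        raise ValueError("no title defined for the next level")
    character.level += 1          # the title property follows the level
    character.max_hp += hp_gain   # the "real" reward: more power
    return f"{character.name} is now {character.title} (level {character.level})."

hero = Character("Aria")
print(level_up(hero))
```

Titles handed out on their own (for a quest, say) would then be the
"conditioned reinforcer without the reward" case, to be used sparingly per
Rule #2.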

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Conditioned reinforcers and Cheating

	An interesting series of experiments has explored the connection between
conditioned reinforcers and cheating.  If subjects are required to work
hard to obtain a reward, the hard work begins to serve as a conditioned
reinforcer.  (This is sometimes referred to as "learned industriousness.")
This led researchers to the strange idea that subjects reinforced for a
difficult task would be _less_ likely to cheat when offered the chance than
subjects rewarded for an easy task.  This was tested experimentally as
follows:

	The subjects were college students.  Half of them were asked to do an easy
task (adding pairs of 2-digit numbers) and half were required to do a hard
task (adding pairs of 7-digit numbers), and all of them were praised for
getting the right answers.  Next, all subjects were asked to do a nearly
impossible task: solving difficult anagrams in only 5 seconds.  After
being presented with a scrambled series of letters, the subjects had 5
seconds to decide what the original word was.  At the end of 5 seconds,
the experimenter told the subject what the answer was and asked them if
they had gotten the right answer.  If they said they had, the subject was
again rewarded.

	Now, the ability to cheat in this second task is obvious.  There's no way
for the experimenter to know if the subject had really worked out the right
answer.  The subject could lie and get rewarded without fear of penalty.
The remarkable finding here was that the students who had previously been
asked to do a hard task cheated less than those who had been asked to do
the easy task.

	This experimental finding has some interesting implications for mud
design.  Typically, games make the first level or so easy, in order not
to turn off players with too steep a learning curve.  However, this
experiment suggests that there may be a trade-off here.  If you start
people out easy, you may be training them to expect easy rewards and to
feel they _deserve_ easy rewards.  This could breed resentment and a strong
desire to outwit the unfair system.







