[MUD-Dev] Basic principles of reinforcement for muds

Sun Jul 23 14:49:58 CEST 2000

	It is a famous truism that if the only tool you have is a hammer,
everything begins to look like a nail.  As an experimental psychologist by
day, I tend to see the various ways muds can be set up in terms of
reinforcement contingencies.  It occurs to me that some of the findings
from experimental psych can be used to help design better, more effective
muds.  The following is a sample of the kind of thing I'm talking about.
If you find this interesting and/or helpful, please tell me so I'll know to
keep posting similar stuff.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

	A schedule of reinforcement is a rule or set of rules governing when the
participant receives their rewards.  Regardless of your opinion of
behaviorism in general, one of Skinner's great contributions to psychology
was his discovery that different patterns of rewards produce different
patterns of behavior.  The anecdote here is that one day he ran low on the
food pellets he was using to reward his rats so he stopped giving them a
pellet after every press and began rewarding them every tenth press or
every 30 seconds.  The results of this were so striking that an entire new
area of psychology was born, which just happens to have some strong
implications for mud design.

	While usually you hear about these sorts of experiments being done with
rats or pigeons, all of the following results have been demonstrated with
humans, most commonly in the form of simple video games.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Fixed Ratio (FR)

	A fixed ratio schedule is one which produces reinforcement every Nth
response.  For example, a rat getting rewarded for every 30th lever-press
would be on an FR schedule, as would a player who required 1000 xp to advance
a level (the classic D&D/Diku method of handling levels).

	When presented with an FR schedule, participants generally pause for a
while after being rewarded before responding again.  After all, that first
press is never going to be rewarded, is it?  But once they start
responding, they generally respond at a fast rate until they're rewarded.
The length of the pause is generally proportional to the size of the ratio,
with larger ratios having longer pauses.

	Mud Application:  If players have a set number of xp they need to
accumulate to go up a level, they're going to pause for some while after
gaining a level.  If they know it's going to be a long time before the next
reward, there's very little incentive to start gaining xp.  This may
lead to them logging on less during that pause, then going on an intensive
leveling spree.
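
	As a rough illustration of the FR rule above, here is a minimal Python
sketch.  The class name FixedRatio and the figure of 50 xp per kill are
assumptions made up for the example, not anything taken from a real mud
codebase.

```python
class FixedRatio:
    """Reward every `ratio`-th response (hypothetical helper for illustration)."""

    def __init__(self, ratio):
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0  # responses made since the last reward

    def respond(self):
        """Register one response; return True if this response is rewarded."""
        self.count += 1
        if self.count >= self.ratio:
            self.count = 0
            return True
        return False


# Assumption: each kill is worth 50 xp, so a 1000 xp level is FR 20 on kills.
level_up = FixedRatio(ratio=1000 // 50)
results = [level_up.respond() for _ in range(40)]
rewarded_kills = [i + 1 for i, hit in enumerate(results) if hit]
print(rewarded_kills)  # [20, 40]
```

The reward lands on exactly every 20th kill, which is what makes the first
kill after a level so unrewarding.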

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Fixed Interval (FI)

	A fixed interval schedule is one that produces reinforcement a set period
of time after the last one.  For example, a mob that repops an hour after
it's killed would be a fixed interval schedule.  The subject still has to
do something to get the reward (kill the mob, press the lever, etc.) but
technically only once.

	On an FI schedule, participants again generally pause after reward but
then start gradually responding more and more as the scheduled reward
approaches.  If they're not rewarded when they thought they would be,
they'll continue responding for a while and then gradually trail off.

	In a mud, the above repop example would result in the player going off and
doing other things for a while after killing the mob.  He would then check
in to see if it's repopped at a gradually increasing rate, so that by the
time he expects it to repop, he's staying at that spot.
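
	The FI rule can be sketched the same way.  This is only an illustration:
the class name FixedInterval is hypothetical, time is passed in by hand as
minutes rather than read from a clock, and time 0 is assumed to be the moment
of the previous kill.

```python
class FixedInterval:
    """Reward the first response made at least `interval` after the last reward."""

    def __init__(self, interval, start=0.0):
        self.interval = interval
        self.available_at = start + interval  # earliest time a response pays off

    def respond(self, now):
        """Register a response at time `now`; return True if it is rewarded."""
        if now >= self.available_at:
            # The timer restarts from the rewarded response, not from the old deadline.
            self.available_at = now + self.interval
            return True
        return False


# A mob that repops 60 minutes after it is killed; the player checks at these minutes.
mob = FixedInterval(interval=60)
checks = [10, 45, 55, 61, 70, 121]
kills = [t for t in checks if mob.respond(t)]
print(kills)  # [61, 121]
```

Every check before the repop time is wasted, and a single check after it is
enough, which is why checking clusters toward the end of the interval.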

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Variable Ratio (VR)

	In a variable ratio schedule, the participant must respond some number of
times to be rewarded, but the specific number required varies.  For
example, a rat might be rewarded on average every 10th lever-press, but
sometimes sooner, sometimes later.  The subject never knows exactly how
many responses are required this time.

	Participants in a VR schedule respond at a rapid consistent rate.  Because
there's always a chance that the next action gives you a reward, there's
always a reason to be trying.  There is no pausing such as is found in the
previous two methods.

	In a mud, this schedule would be something like a system where the amount
of xp for the next level was randomly generated and not known by the
player.  The mean amount required to advance to level 2 would be
consistent, but the actual amount would vary from player to player.  This
way, the player always has a reason to be playing.  There's always a chance
that the next thing he/she does will bring a reward.
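
	A VR schedule only needs the requirement to be redrawn after each reward.
The sketch below is one possible way to do that, not the only one: the class
name VariableRatio is hypothetical, and drawing the requirement uniformly
from 1 to 2*mean-1 is an assumption chosen simply because it averages out to
the stated mean.

```python
import random


class VariableRatio:
    """Reward after a hidden, randomly drawn number of responses (illustrative)."""

    def __init__(self, mean_ratio, rng=None):
        self.mean_ratio = mean_ratio
        self.rng = rng or random.Random()
        self.count = 0  # responses made since the last reward
        self.required = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean-1, whose average is mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self):
        """Register one response; return True if this response is rewarded."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()  # the next requirement stays hidden
            return True
        return False


schedule = VariableRatio(mean_ratio=10, rng=random.Random(1))
rewards = sum(schedule.respond() for _ in range(10_000))
print(rewards)  # should land near 1000, about one reward per 10 responses
```

The long-run payout matches an FR 10 schedule, but no individual response
can be ruled out as the rewarded one.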

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Variable Interval (VI)

	In a variable interval schedule, the amount of time until the next reward
varies from trial to trial.  For example, a rat might be reinforced an average of 30
seconds after the last reward.  The participant doesn't know exactly how
long it will take this time.

	Participants in a VI schedule respond at a moderate consistent rate,
similar to the VR schedule but slower.  There is no pausing as in FR and FI.

	In a mud, a system where a mob repopped an average of two hours after it
was killed would be a VI schedule.  Players would check the spot at a
consistent rate, every few minutes or so.
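
	The VI case can be sketched by redrawing the wait after each reward.  As
before this is only an illustration under stated assumptions: the class name
VariableInterval is hypothetical, the wait is drawn uniformly between 50% and
150% of the mean, and the player is assumed to check every 5 minutes.

```python
import random


class VariableInterval:
    """Reward the first response after a randomly drawn wait (illustrative)."""

    def __init__(self, mean_interval, rng=None, start=0.0):
        self.mean_interval = mean_interval
        self.rng = rng or random.Random()
        self.available_at = start + self._draw()

    def _draw(self):
        # Assumption: uniform between 50% and 150% of the mean wait.
        return self.rng.uniform(0.5 * self.mean_interval, 1.5 * self.mean_interval)

    def respond(self, now):
        """Register a response at time `now`; return True if it is rewarded."""
        if now >= self.available_at:
            self.available_at = now + self._draw()
            return True
        return False


# A mob that repops on average 120 minutes after it is killed.
mob = VariableInterval(mean_interval=120, rng=random.Random(2))
kill_times = [t for t in range(0, 6000, 5) if mob.respond(t)]
gaps = [b - a for a, b in zip(kill_times, kill_times[1:])]
print(len(kill_times), min(gaps), max(gaps))
```

Because the player cannot tell whether a given wait will be short or long,
steady checking is the only strategy that reliably catches the repop.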

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Future possible topics:

	Addiction, a how-to guide.
	Conditioned reinforcers, how to get players to do things for free.
	How players allocate their time between possible activities.
	Flocks, herds, and dragon-slaying posses.

	If you have an idea for a specific topic you'd like to know about, email me.

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
http://www.kanga.nu/lists/listinfo/mud-dev