[MUD-Dev] Metrics for assessing game design [was: When will new MMORPGs that are coming out get original with the gameplay?]
ceo
ceo at grexengine.com
Wed Jul 16 10:07:46 CEST 2003
On Wed, 09 Jul 2003 09:55:00 +0100 ceo <ceo at grexengine.com> wrote:
> J C Lawrence wrote:
>> We have no external objective metrics which can either detect or
>> demonstrate a "good game design" (where good is defined as fun,
>> large enough player base, profitable, sustainable etc) without
>> implementing and thus sustaining the costs of that game.
> I have some, and they *seem* to work well, but so far my attempts
> to get them spread widely, or discussed in detail (e.g. conference
> talks) have met with a complete lack of interest...
Warning: this is very much a work-under-investigation. I'm
particularly interested in examples where it doesn't seem to work (I
haven't found any so far, but there's an awful lot of games to try
testing it on! So there could easily be thousands...;)). It has been
aimed at general games, not just MUDs.
The following is culled from the documents I'm currently maintaining
on this; in some cases I've glossed over the details (this post is
far too long already!), particularly omitting analysis of any other
games (which is lengthy and less complete; there's so much to say
when analysing with this framework!).
Hopefully there's enough to be controversial, somehow :).
P.S. About 50% of those who've dismissed this out of hand have
been professional games developers who felt that "of course it's
not possible to rate fun, and never will be; it's something you
feel in your gut, not something you can analyse...[subtext: how
dare you imply that my job as a games designer can be pared down
to a few simple rules; it's much more complex than that]". I only
really mean this as the combination of a metric, plus that the act
of evaluating against it generates guidelines for how you can
improve a game's score.
--------------------------------------
Theory: The aspects that make a game fun are the same as those that make
a game highly suited to AI-optimization via Genetic Programming.
Namely:
1. There are always multiple possible EFFECTIVE strategies
available
2. Every action taken in the direction of a particular strategy
alters the available strategies (usually by making some harder to
achieve, or by making some disappear, or making new ones become
available)
3. There is no "universally optimal" strategy
4. It is usually not trivially obvious which strategy is optimal
at the current moment in time
5. No significantly large set of dis-similar but related
strategies should dead-end in the same situation (closely related
to 2 above) - it's very positive if this almost happens, but then
gets knocked out of that state, either through game-mechanics, or
the quick death of the player (but if so, this situation should
only happen rarely!).
Evaluate a game by asking:
1. To what extent does each player-action cause the available
actions to change?
2. To what extent are MULTIPLE and UNIQUE strategies continuously
available to the player?
3. To what extent do small refinements of your strategy result in
steady, significant gains in your MEASURABLE game achievement?
(if the only gains are immeasurable, the game scores poorly,
because it is not providing enough feedback to the player;
non-quantitative, qualitative things like completing a level DO
count as "measurable" achievements).
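To make question 1 concrete, here is a minimal sketch of one way it might be scored. The measure (mean Jaccard distance between the action set before and after each action) and the names `action_change_score`, `token_actions` and `take_token` are my own assumptions for illustration; nothing above prescribes a particular formula.

```python
from typing import Callable, Iterable, TypeVar

S = TypeVar("S")  # game state (hypothetical; any hashable description will do)
A = TypeVar("A")  # player action


def action_change_score(
    state: S,
    legal_actions: Callable[[S], Iterable[A]],
    apply_action: Callable[[S, A], S],
) -> float:
    """Mean Jaccard distance between the action set before and after
    each legal action. 0.0 means no action changes what is available;
    1.0 means every action replaces the available set entirely."""
    before = frozenset(legal_actions(state))
    if not before:
        return 0.0
    total = 0.0
    for action in before:
        after = frozenset(legal_actions(apply_action(state, action)))
        union = before | after  # never empty, because `before` is not
        total += 1.0 - len(before & after) / len(union)
    return total / len(before)


# Toy game, invented for illustration: the state is the set of tokens
# still on the table, and the only action is to take one of them.
def token_actions(state: frozenset) -> frozenset:
    return state


def take_token(state: frozenset, token: int) -> frozenset:
    return state - {token}


if __name__ == "__main__":
    # Each move removes one of three options, so the score is about 0.33.
    print(action_change_score(frozenset({1, 2, 3}), token_actions, take_token))
```

A game where the same two moves are always available would score 0.0 on this measure, which is the low end of question 1.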
Detailed thoughts on each question:
1: Dune 2000 has a good example of this without the word
"effective". It didn't do well as a game, despite seeming to be
excellent at first glance. (See the analysis below).
2: This is an alternative to looking at whether gameplay is
"non-linear". I've seen much debate over whether nonlinearity is
inherently a good or bad thing; the lack of consensus after a
decade or more seems to indicate it was the wrong question to be
asking in the first place. The idSoftware Quake level designer who
was proud of making "simple" challenges in his levels repeatedly
fell afoul of this, and single-handedly put me off it as a
single-player game.
3: There may be "obviously optimal" short-term strategies and
long-term strategies, but none that are both, at least not for the
majority of runs through the game (one of the achievements of
aspect 3 is to provide a game where the player can't just "go on
automatic" to play the game, no matter how experienced they are)
3: There is often a subset of strategies which it is clear
contains the optimal strategy, but without it being clear
precisely which strategy that is. I suspect this should be
promoted to an aspect of its own - it works as a good corollary
for the others: it encourages players to take actions (to attempt
to work out which of the "good set" is actually the best one),
thereby exercising 2 above. Even better is if one member of the
set is quite poor - but ONLY if that can be quickly deduced once
the player starts pursuing it. Dune 2000, for instance, only
informed you that a given strategy was poor about 10 minutes after
you started producing your troops, when they finally got into
battle and were slaughtered. The presence of the "poor" option is
only a good thing if it adds to the interest of deducing the best
option: it does not have any other relevance to the game (and so
should not cause game-losing situations, like in Dune2000). [*]
3: Usually there *is* a short-term strategy that is obviously the
best - but comes with considerable *obvious* penalization in the
long-term. Some players will adopt it straight away, and discover
the consequences through repeated play. Others will avoid it
initially, because the long-term problems are so obvious. In
multiplayer-play, the tendency is for both types of player to
adopt less extreme behaviour (the former often wipes the floor
with the latter, but later on always ends up losing the game to a
third party; they get so much experience of the end-game that they
are forced to reassess). [*]
5: By "related" I mean that from using one, you are allowed to
switch to one of the others (since in many dangerous situations
your options disappear rapidly!). By "dead-end" I mean "lock you
into a losing situation permanently". It's OK if the situation is
temporary - perhaps greatly weakening your character/troops/etc in
the process. In evolutionary programming, it is fairly obvious
that populations get stuck in evolutionary dead-ends quite
often. The normal tactic there is to monitor the statistics of the
population, and if the variance (usually of their "success" in the
task) over time decreases too much, switch temporarily to more
random selection, causing the population to escape the niche
(although there are many papers on more advanced ways of avoiding
these "local maxima in fitness landscape" [search terms to use on
google, even though the terminology ought to be "fitness range"]).
5: Noah Falstein pointed out that "The value of randomness in
keeping you out of dead-end loops is a well-known technique among
AI programmers. I never thought of it in the sense of how it
resembles random mutation."
5: An example of making the dead-end temporary might be in a
scenario where the enemy gets some advantage whose knock-on effect
causes them to rapidly obliterate your troops. If this scenario
happens often when playing a certain level, or if once all your
troops die, you start again in the same situation as when you
died, then the specific "advantage" that the enemy had should be
tied to the ratio in "numbers of troops" between him and you -
this provides a natural get-out point, enabling the game to carry
on flowing, but still gives the enemy (who managed to manoeuvre
into the situation) a big reward (he decimated your troops at
little or no cost to his own).
[*]: in particular, this is one of the areas I'm still looking
into; I haven't done enough analysis of how the
presence/absence/multiplicity of this occurring affects how fun
the game is. I intend to start from the GP perspective, since that
has borne everything else here!, and see if the results from GP
continue to usefully generalize to looking at fun.
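The variance-monitoring tactic described under aspect 5 above can be sketched roughly as follows. This is a simplification under stated assumptions: it checks the variance of the current generation only (not its trend over time), and the threshold `variance_floor`, the size-2 tournament and the function names are all hypothetical choices of mine rather than anything taken from a specific GP system.

```python
import random
import statistics
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")  # an individual in the population


def is_stagnant(scores: Sequence[float], variance_floor: float) -> bool:
    """True when the fitness scores have converged so far that the
    population is probably stuck in a niche."""
    return statistics.pvariance(scores) < variance_floor


def select_parents(
    population: Sequence[T],
    fitness: Callable[[T], float],
    k: int,
    variance_floor: float,
    rng: random.Random,
) -> List[T]:
    """Pick k parents. Normally uses a size-2 tournament; if the
    population has stagnated, temporarily falls back to uniform random
    selection so that it can escape the niche."""
    scores = [fitness(ind) for ind in population]
    if is_stagnant(scores, variance_floor):
        # Escape mode: ignore fitness entirely for this generation.
        return [rng.choice(population) for _ in range(k)]
    parents = []
    for _ in range(k):
        a, b = rng.choice(population), rng.choice(population)
        parents.append(a if fitness(a) >= fitness(b) else b)
    return parents
```

The game-design analogue is the same switch: when the player's options have collapsed to near-identical outcomes, something outside their current strategy has to shake the situation loose.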
Worked examples:
Bomberman / Moonquake / etc:
------
Bomberman scores very well on question one; every bomb re-shapes the
level (which DOES matter because it alters how and where you will
meet your enemy). Even movement scores moderately well, because of
the slow speed of movement. Even choosing which powerup to pick up
scores well, because the fuse on bombs is so slow that you have
time to massively alter the effect of your bomb (by picking up new
powerups) before it goes off - and when you start, the other
player could be "trapped, but safe" but 3 seconds later "trapped,
and about to die".
The framework/metrics above suggest that Bomberman could be
improved by adding "powerdowns". These would increase the ability
to alter the available actions: they give you some potential
opportunity to "conserve" powerdowns in case you became trapped by
a combination of your own bombs and other people's. You could then
downgrade your bombs, enabling you to stand closer to your bomb
without being caught in the explosion, and possibly even providing
safe space for you to survive.
The gameplay is split into 3 stages:
stage1: no contact possible between players (no connection
between their areas of the board)
stage2: contact possible, unexploded blocks remain
stage3: no blocks remain, mano a mano duelling
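For anyone wanting to apply the three-stage split mechanically (e.g. to log how long a match spends in each stage), here is a rough sketch. The grid encoding is my own assumption, not taken from any real Bomberman version: `#` is an indestructible wall, `x` a destructible block, `.` empty floor, and `A` / `B` the two players.

```python
from collections import deque
from typing import Sequence


def game_stage(grid: Sequence[str]) -> int:
    """Classify a two-player board into stage 1, 2 or 3.

    Stage 1: no walkable path between the players.
    Stage 2: a path exists and destructible blocks remain.
    Stage 3: a path exists and no destructible blocks remain.
    """
    positions = {}
    blocks_remain = False
    for y, row in enumerate(grid):
        for x, cell in enumerate(row):
            if cell in "AB":
                positions[cell] = (y, x)
            elif cell == "x":
                blocks_remain = True

    # Breadth-first search from player A over empty floor.
    start, goal = positions["A"], positions["B"]
    seen = {start}
    queue = deque([start])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (
                0 <= ny < len(grid)
                and 0 <= nx < len(grid[ny])
                and (ny, nx) not in seen
                and grid[ny][nx] in ".B"
            ):
                seen.add((ny, nx))
                queue.append((ny, nx))

    if goal not in seen:
        return 1
    return 2 if blocks_remain else 3
```

For example, `game_stage(["A.x.B"])` is stage 1, because the only route between the players is blocked by a destructible block.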
In Stage1, there are typically 5 strategies available:
a. Collect bomb powerups
b. Collect firepower powerups
c. Be as efficient as possible, aiming to get most powerups
possible before game moves to stage2
d. Gain as many powerups as possible whilst sacrificing SOME
efficiency to ensure you lose no lives through accidents
e. (Some versions only): Locate "special" powerups. E.g.,
Moonquake had a "question mark" which had a random effect: it
might kill you, or it might kill your opponent, or reverse your
controls, or give you an extra life, etc. Assuming weightings
that make "certain death" less likely than the chance to kill
your opponent (which was not guaranteed - it required some skill
and timing!), this allows an additional strategy of "reducing
your opponent's lives before the game even makes it to stage2"
It's worth noting that each bomb, if ideally positioned, reveals
potentially three powerups at once. This gives the game a
particularly high score on question 2 - because a single action
often provides many additional powerups.
In addition, because the player can only move relatively slowly
(excepting versions where speed powerups were extreme), when these
multiple powerups appear simultaneously, it is a major decision as
to which should be picked up first. Stage1 often switches to
stage2 with 5-10 uncollected powerups on the board.
As for question 3, the main gains from improved strategy are:
stage1: decreasing your careless mistakes results in losing
fewer lives! NOTE: you have multiple lives per level, so that
slightly more fine feedback is provided than "you died; level
over". The metrics suggest that these multiple lives are a
particular benefit.
stage2: Several opportunities to refine/experiment with the
strategies from stages 1 and 3, most of which run in parallel
during stage2. Price of failure is again mainly shown by loss of
lives, but also from the extremely obvious graphical display of
increasing bomb-power compared to your opponent (both players
will be laying bombs often in almost all strategies, so the
feedback of your comparative power is quick and fine-grained)
stage3: The only feedback is player death - yours or your
opponent's - and losing/winning the level (although that
reflects more on your overall combination of strategies
throughout the three stages; losing/winning is the feedback for
your high-level strategy; this is partly why the multiple lives
are necessary: so that you get feedback on both your low-level
and your high-level strategies at once).
Dune/Dune2000:
------
Noah pointed out: "one of the hardest things to do is tune a game,
particularly a strategy game, so that it doesn't quickly devolve
into a simple strategy (like building cheap units and rushing the
other side's base) that is always most effective."
Without a thorough understanding of the problem we're trying to
avoid, people have tried very hard to avoid a slightly different
problem and produced bad games as a result. When Dune2000 was
developed, the "tank-rush" tactic for RTS games was a hot topic
because of its continued devastating effectiveness in almost
every RTS (Starcraft hadn't been released yet; both games came out
in 1998 [ http://www.mobygames.com/game/versions/gameId,378/ and
http://www.mobygames.com/game/versions/gameId,331/ ]). I've never
spoken to the developers, but the game-design changes for Dune2000
suggest that weakening this tactic - or removing it altogether -
was a high priority. But if so, they were trying to solve the
wrong problem:
I'm still an avid fan of Dune2, and although I enjoyed Dune2000,
it managed not only to "make no single strategy universally
effective" but in the process made "almost no single strategy EVER
effective". If you contrast 2 and 2000 (rather a nice case study
actually, given the amount that DIDN'T change, and the big
separation in time between release, allowing people to think about
other things for a long time), you see that 2 had several nice
emergent strategies. For instance, I don't believe that anyone
deliberately included the "turret-rush" tactic - building a line
of concrete (undestroyable) up to the opponent's base, then
building a load of rocket turrets at the end.
The tactic wasn't as devastating as a tank rush, since you could
only build one turret at a time, and if the opponent could blow up
the first turret you built before it did significant damage, they
could repair the buildings hit before your next turret appeared.
On the other hand, if you targeted all your turrets at a
seemingly unimportant building (one the opponent - even the AI -
didn't care about losing), you could destroy that, and then use a
2x2 concrete block which could "jump" your building-radius over
enemy structures, since it could be placed even if only one square
was unoccupied (the other 3 were ignored). Then you could build
your next turret in the spot previously occupied by the
un-important building. This way you could advance your turrets
deeper and deeper into the base, and hopefully eventually place
one in a position where it was actually only targetable by a
small number of enemy units (since enemies couldn't move "through"
buildings). Then it could sit there, firing and being repaired,
and very slightly swing the tide in your favour.
Persevere long enough, and get lucky occasionally, and you could
eventually overrun most bases with "turret-attacks" - but it was
never a certainty, and if you accidentally destroyed the wrong
building and let all the opponent's vehicles in too close, you lost
your chance forever.
I think the single saving grace for Dune2, tactically, was the
inability to select multiple units. This was of course also the
most annoying UI feature / design flaw. But it weakened the tank
rush to the status of being only one of several good tactics. (on
a side-note, Shogun:Total War implements a similar brake on
tank-rushes, by making individuals slow down a lot and stop
fighting when trying to manoeuvre in the midst of a pitched
battle. If your units get too enmeshed in the thick of things,
their attack rating drops because they spend half their time
turning back and forth trying to aim on a different unit.)
Dune2000 had such an intricate, complex web of units doing more or
less damage depending on what they were targeting that I know
lots of people who got fed up with watching a huge battle between
hundreds of units come down to the presence or absence of a
single enemy (who you probably didn't even pick out amongst all
the other units). Tank rushes were almost worthless, even with a
variety of units, because it was too hard to get the right units
to the front every time the "nearest enemy" changed from one unit
type to another - if you didn't, your huge tanks would get blown
to pieces by a single soldier, etc.
Quake:
------
Quake was dangerously close to violating aspect 5 (related
strategies should not dead-end the player; i.e. they should always
have an opportunity to escape), but in the end it was saved by
respawning items and good level design (specifically the placement
of item and player spawn points): note that all the "good" Quake
levels place the most mutually-destructive weapons/powerups far
apart, so that even if a player has 200 armour, the lightning gun
and quad damage, they cannot hold onto the advantage forever. The
Quad times out before it reappears, so if they sit still (to get
the next quad) the opponent can go and pick up the 200 armour AND
the lightning gun too, and knows that sooner or later the first
player will become equally vulnerable for long enough to kill
them. If the first player runs around denying the armour to their
opponent, the opponent runs around as well, picking up weapons and
other armour instead. And if the dominant player does NOT camp the
Quad, the underdog player can grab it for an immense
advantage. Rather like scissors/paper/stone.
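The scissors/paper/stone property (aspect 3: no "universally optimal" strategy) can be checked mechanically if you are willing to reduce a game to a payoff table. The sketch below is only an illustration under that assumption: `payoff[i][j]` is a made-up number for how well strategy i does against strategy j, and `universally_optimal` is a hypothetical helper that reports any strategy which is at least as good as every other strategy against every opponent, and strictly better somewhere.

```python
from typing import List, Sequence


def universally_optimal(payoff: Sequence[Sequence[float]]) -> List[int]:
    """Return the indices of strategies that dominate all others.

    Strategy i dominates k if payoff[i][j] >= payoff[k][j] for every
    opposing strategy j, and is strictly greater for at least one j.
    An empty result means no strategy is universally optimal.
    """
    n = len(payoff)
    winners = []
    for i in range(n):
        dominates_all = all(
            all(payoff[i][j] >= payoff[k][j] for j in range(len(payoff[i])))
            and any(payoff[i][j] > payoff[k][j] for j in range(len(payoff[i])))
            for k in range(n)
            if k != i
        )
        if dominates_all:
            winners.append(i)
    return winners


# Illustrative numbers only: +1 = win, 0 = draw, -1 = loss.
SCISSORS_PAPER_STONE = [
    [0, 1, -1],   # scissors vs scissors, paper, stone
    [-1, 0, 1],   # paper
    [1, -1, 0],   # stone
]

# A degenerate RTS where strategy 0 (say, the tank rush) beats the rest.
TANK_RUSH_DOMINATES = [
    [0, 1, 1],
    [-1, 0, 1],
    [-1, -1, 0],
]

if __name__ == "__main__":
    print(universally_optimal(SCISSORS_PAPER_STONE))  # no dominant strategy
    print(universally_optimal(TANK_RUSH_DOMINATES))   # strategy 0 dominates
```

Real games rarely reduce to a static table like this, since the payoffs shift with every action (aspect 2), so treat it as a sanity check on a snapshot rather than a full evaluation.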
Side-note: Doom, of course, had the option of "no respawn" for
items when playing in deathmatch - and proved almost no fun at all
in that mode. No surprise that the option disappeared for Quake,
although of course the "no respawn" was a natural extension of the
single-player gameplay.
Adam M
_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev