[MUD-Dev] Metrics for assessing game design [was: When will new MMORPGs that are coming out get original with the gameplay?]

Wed Jul 16 10:07:46 CEST 2003

On Wed, 09 Jul 2003 09:55:00 +0100 ceo  <ceo at grexengine.com> wrote:
> J C Lawrence wrote:

>> We have no external objective metrics which can either detect or
>> demonstrate a "good game design" (where good is defined as fun,
>> large enough player base, profitable, sustainable etc) without
>> implementing and thus sustaining the costs of that game.

> I have some, and they *seem* to work well, but so far my attempts
> to get them spread widely, or discussed in detail (e.g. conference
> talks) have met with a complete lack of interest...

Warning: this is very much a work-under-investigation. I'm
particularly interested in examples where it doesn't seem to work (I
haven't found any so far, but there's an awful lot of games to try
testing it on! So there could easily be thousands...;)). It has been
aimed at general games, not just MUDs.

The following is culled from the documents I'm currently maintaining
on this; in some cases I've glossed over the details (this post is
far too long already!), particularly omitting analysis of any other
games (which is lengthy and less complete; there's so much to say
when analysing with this framework!).

Hopefully there's enough to be controversial, somehow :).

  P.S. About 50% of those who've dismissed this out of hand have
  been professional games developers who felt that "of course it's
  not possible to rate fun, and never will be; it's something you
  feel in your gut, not something you can analyse...[subtext: how
  dare you imply that my job as a games designer can be pared down
  to a few simple rules; it's much more complex than that]". All I
  really mean by this is a metric, plus the observation that the act
  of evaluating a game against it generates guidelines for how you
  can improve that game's score.

--------------------------------------

Theory: The aspects that make a game fun are the same as those that make
a game highly suited to AI-optimization via Genetic Programming.

Namely:

  1. There are always multiple possible EFFECTIVE strategies
  available

  2. Every action taken in the direction of a particular strategy
  alters the available strategies (usually by making some harder to
  achieve, or by making some disappear, or making new ones become
  available)

  3. There is no "universally optimal" strategy

  4. It is usually not trivially obvious which strategy is optimal
  at the current moment in time

  5. No significantly large set of dis-similar but related
  strategies should dead-end in the same situation (closely related
  to 2 above) - it's very positive if this almost happens, but then
  gets knocked out of that state, either through game-mechanics, or
  the quick death of the player (but if so, this situation should
  only happen rarely!).

Evaluate a game by asking:

   1. To what extent does each player-action cause the available
   actions to change?

   2. To what extent are MULTIPLE and UNIQUE strategies continuously
   available to the player?

   3. To what extent do small refinements of your strategy result in
   steady, significant gains in your MEASURABLE game achievement?
   (if the only gains are immeasurable, the game scores poorly,
   because it is not providing enough feedback to the player;
   non-quantitative, qualitative things like completing a level DO
   count as "measurable" achievements).
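
  One rough way to put a number on question 1 is sketched below in
  Python. It assumes you can list the actions available before and
  after each player-action, and it uses the Jaccard distance between
  the two sets as the "extent of change". Both the choice of distance
  and the action names in the trace are assumptions made for
  illustration; they are not part of the framework above.

```python
def action_change(before, after):
    """Jaccard distance between two sets of available actions.

    0.0 means the player-action changed nothing; 1.0 means the
    available actions were replaced entirely.
    """
    before, after = set(before), set(after)
    union = before | after
    if not union:
        return 0.0
    return 1.0 - len(before & after) / len(union)


def question_one_score(transitions):
    """Average change over (before, after) pairs from one play-through."""
    if not transitions:
        return 0.0
    return sum(action_change(b, a) for b, a in transitions) / len(transitions)


# Hypothetical trace: the action names are invented for this example.
trace = [
    ({"move", "drop_bomb"}, {"move", "drop_bomb"}),             # no change
    ({"move", "drop_bomb"}, {"move", "flee", "grab_powerup"}),  # bomb laid
]
score = question_one_score(trace)
```

  A game where most actions leave the option set untouched would
  average close to 0 on this measure.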

Detailed thoughts on each aspect:

  1: Dune 2000 has a good example of this without the word
  "effective". It didn't do well as a game, despite seeming to be
  excellent at first glance. (See the analysis below).

  2: This is an alternative to looking at whether gameplay is
  "non-linear". I've seen much debate over whether nonlinearity is
  inherently a good or bad thing; the lack of consensus after a
  decade or more seems to indicate it was the wrong question to be
  asking in the first place. The id Software Quake level designer who
  was proud of making "simple" challenges in his levels repeatedly
  fell afoul of this, and single-handedly put me off it as a
  single-player game.

  3: There may be "obviously optimal" short-term strategies and
  long-term strategies, but none that are both, at least not for the
  majority of runs through the game (one of the achievements of
  aspect 3 is to provide a game where the player can't just "go on
  automatic" to play the game, no matter how experienced they are)

  3: There is often a subset of strategies which it is clear
  contains the optimal strategy, but without it being clear
  precisely which strategy that is. I suspect this should be
  promoted to an aspect of its own - it works as a good corollary
  for the others: it encourages players to take actions (to attempt
  to work out which of the "good set" is actually the best one),
  thereby exercising 2 above. Even better is if one member of the
  set is quite poor - but ONLY if that can be quickly deduced once
  the player starts pursuing it. Dune 2000, for instance, only
  informed you that a given strategy was poor about 10 minutes after
  you started producing your troops, when they finally got into
  battle and were slaughtered. The presence of the "poor" option is
  only a good thing if it adds to the interest of deducing the best
  option: it does not have any other relevance to the game (and so
  should not cause game-losing situations, like in Dune2000). [*]

  3: Usually there *is* a short-term strategy that is obviously the
  best - but comes with considerable *obvious* penalization in the
  long-term. Some players will adopt it straight away, and discover
  the consequences through repeated play. Others will avoid it
  initially, because the long-term problems are so obvious. In
  multiplayer-play, the tendency is for both types of player to
  adopt less extreme behaviour (the former often wipes the floor
  with the latter, but later on always ends up losing the game to a
  third party; they get so much experience of the end-game that they
  are forced to reassess). [*]

  5: By "related" I mean that from using one, you are allowed to
  switch to one of the others (since in many dangerous situations
  your options disappear rapidly!). By "dead-end" I mean "lock you
  into a losing situation permanently". It's OK if the situation is
  temporary - perhaps greatly weakening your character/troops/etc in
  the process. In evolutionary programming, it is fairly obvious
  that populations get stuck in evolutionary dead-ends quite
  often. The normal tactic there is to monitor the statistics of the
  population, and if the variance (usually of their "success" in the
  task) over time decreases too much, switch temporarily to more
  random selection, causing the population to escape the niche
  (although there are many papers on more advanced ways of avoiding
  these "local maxima in fitness landscape" [search terms to use on
  google, even though the terminology ought to be "fitness range"]).
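
  A minimal Python sketch of that tactic follows. It is simplified:
  it checks the variance of fitness across the current population
  rather than tracking it over time, and the function name, the
  `variance_floor` threshold and the use of roulette-wheel selection
  as the "normal" mode are all assumptions made for illustration.

```python
import random
import statistics


def select_parents(population, fitness, n, variance_floor=1e-3, rng=random):
    """Pick n parents, falling back to uniform random choice when the
    spread of fitness collapses.

    population: list of individuals
    fitness: list of non-negative scores, in the same order
    variance_floor: hypothetical threshold, to be tuned per problem
    Returns (parents, mode), where mode is "fitness" or "random".
    """
    variance = statistics.pvariance(fitness)
    if variance < variance_floor or sum(fitness) <= 0:
        # The population has converged into a niche: ignore fitness
        # for this round so that it has a chance to escape.
        return [rng.choice(population) for _ in range(n)], "random"
    # Normal case: fitness-proportional (roulette wheel) selection.
    return rng.choices(population, weights=fitness, k=n), "fitness"
```

  In use, the caller would run this once per generation and could log
  the returned mode to see how often the population stagnates.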

  5: Noah Falstein pointed out that "The value of randomness in
  keeping you out of dead-end loops is a well-known technique among
  AI programmers.  I never thought of it in the sense of how it
  resembles random mutation."

  5: An example of making the dead-end temporary might be in a
  scenario where the enemy gets some advantage whose knock-on effect
  causes them to rapidly obliterate your troops. If this scenario
  happens often when playing a certain level, or if once all your
  troops die, you start again in the same situation as when you
  died, then the specific "advantage" that the enemy had should be
  tied to the ratio in "numbers of troops" between him and you -
  this provides a natural get-out point, enabling the game to carry
  on flowing, but still gives the enemy (who managed to manoeuvre
  into the situation) a big reward (he decimated your troops at
  little or no cost to his own).
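
  One possible reading of "tied to the ratio" is sketched below in
  Python: the enemy's bonus is at full strength while the forces are
  even, and fades out as the enemy comes to outnumber you, which
  gives the get-out point. The linear fade and the `base_bonus` and
  `cutoff_ratio` values are assumptions for illustration only.

```python
def enemy_advantage(enemy_troops, player_troops,
                    base_bonus=2.0, cutoff_ratio=3.0):
    """Damage multiplier for the enemy, fading to 1.0 (no bonus) as
    the enemy's numerical superiority grows.

    base_bonus and cutoff_ratio are hypothetical tuning values.
    """
    if player_troops <= 0:
        return 1.0  # nobody left to use the bonus against
    ratio = enemy_troops / player_troops
    if ratio >= cutoff_ratio:
        return 1.0  # get-out point: the advantage has expired
    if ratio <= 1.0:
        return base_bonus  # full bonus while the enemy has no lead
    # Linear fade between a ratio of 1.0 and cutoff_ratio.
    fade = (cutoff_ratio - ratio) / (cutoff_ratio - 1.0)
    return 1.0 + (base_bonus - 1.0) * fade
```

  With these values the enemy does double damage at even numbers,
  1.5x when outnumbering you two to one, and normal damage from
  three to one upwards.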

  [*]: in particular, this is one of the areas I'm still looking
  into; I haven't done enough analysis of how the
  presence/absence/multiplicity of this occurring affects how fun
  the game is. I intend to start from the GP perspective, since that
  has borne everything else here, and see if the results from GP
  continue to usefully generalize to looking at fun.

Worked examples:

Bomberman / Moonquake / etc:
------

  Bomberman scores very well on question one; every bomb re-shapes the
  level (which DOES matter because it alters how and where you will
  meet your enemy). Even movement scores moderately well, because of
  the slow speed of movement. Even choosing which powerup to pick up
  scores well, because the fuse on bombs is so slow that you have
  time to massively alter the effect of your bomb (by picking up new
  powerups) before it goes off - and when you start, the other
  player could be "trapped, but safe" but 3 seconds later "trapped,
  and about to die".

  The framework/metrics above suggest that Bomberman could be
  improved by adding "powerdowns". These would increase the ability
  to alter the available actions: they give you some potential
  opportunity to "conserve" powerdowns in case you became trapped by
  a combination of your own bombs and other people's. You could then
  downgrade your bombs, enabling you to stand closer to your bomb
  without being caught in the explosion, and possibly even providing
  safe space for you to survive.

  The gameplay is split into 3 stages:

    stage1: no contact possible between players (no connection
    between their areas of the board)

    stage2: contact possible, unexploded blocks remain

    stage3: no blocks remain, mano a mano duelling
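
  The three stages can be told apart mechanically from the state of
  the board. The Python sketch below assumes a two-player board given
  as a list of strings, with an invented encoding ('#' solid wall,
  'x' destructible block, '.' empty floor, 'A' and 'B' the players),
  and treats "contact possible" as "a walkable path exists between
  the players", ignoring bomb blasts.

```python
from collections import deque


def classify_stage(grid):
    """Return 1, 2 or 3 for a two-player board.

    Hypothetical encoding: '#' solid wall, 'x' destructible block,
    '.' empty floor, 'A' and 'B' the two players.
    """
    cells = {(r, c): ch
             for r, row in enumerate(grid)
             for c, ch in enumerate(row)}
    start = next(pos for pos, ch in cells.items() if ch == "A")
    goal = next(pos for pos, ch in cells.items() if ch == "B")

    # Breadth-first flood fill from A over walkable cells.
    seen, queue = {start}, deque([start])
    while queue:
        r, c = queue.popleft()
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            # Cells off the board are treated as solid wall.
            if nxt not in seen and cells.get(nxt, "#") not in "#x":
                seen.add(nxt)
                queue.append(nxt)

    if goal not in seen:
        return 1  # the players' areas are not connected
    if any(ch == "x" for ch in cells.values()):
        return 2  # contact possible, unexploded blocks remain
    return 3      # no blocks remain
```

  For example, classify_stage(["AxB"]) gives 1, because the only
  route between the players is through a block.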

  In Stage1, there are typically 5 strategies available:

    a. Collect bomb powerups

    b. Collect firepower powerups

    c. Be as efficient as possible, aiming to get most powerups
    possible before game moves to stage2

    d. Gain as many powerups as possible whilst sacrificing SOME
    efficiency to ensure you lose no lives through accidents

    e. (Some versions only): Locate "special" powerups. E.g.,
    Moonquake had a "question mark" which had a random effect: it
    might kill you, or it might kill your opponent, or reverse your
    controls, or give you an extra life, etc. Assuming weightings
    that make "certain death" less likely than the chance to kill
    your opponent (which was not guaranteed - it required some skill
    and timing!), this allows an additional strategy of "reducing
    your opponent's lives before the game even makes it to stage2"

  It's worth noting that each bomb, if ideally positioned, reveals
  potentially three powerups at once. This gives the game a
  particularly high score on question 2 - because a single action
  often provides many additional powerups.

  In addition, because the player can only move relatively slowly
  (excepting versions where speed powerups were extreme), when these
  multiple powerups appear simultaneously, it is a major decision as
  to which should be picked up first. Stage1 often switches to
  stage2 with 5-10 uncollected powerups on the board.

  As for question 3, the main gains from improved strategy are:

    stage1: decreasing your careless mistakes results in losing
    fewer lives! NOTE: you have multiple lives per level, so that
    slightly more fine feedback is provided than "you died; level
    over". The metrics suggest that these multiple lives are a
    particular benefit.

    stage2: Several opportunities to refine/experiment with the
    strategies from stages 1 and 3, most of which run in parallel
    during stage2. Price of failure is again mainly shown by loss of
    lives, but also from the extremely obvious graphical display of
    increasing bomb-power compared to your opponent (both players
    will be laying bombs often in almost all strategies, so the
    feedback of your comparative power is quick and fine-grained)

    stage3: The only feedback is player death - yours or your
    opponent's - and losing/winning the level (although that
    reflects more on your overall combination of strategies
    throughout the three stages; losing/winning is the feedback for
    your high-level strategy; this is partly why the multiple lives
    are necessary: so that you get feedback on both your low-level
    and your high-level strategies at once).

Dune/Dune2000:
------

  Noah pointed out: "one of the hardest things to do is tune a game,
  particularly a strategy game, so that it doesn't quickly devolve
  into a simple strategy (like building cheap units and rushing the
  other side's base) that is always most effective."

  Without a thorough understanding of the problem we're trying to
  avoid, people have tried very hard to avoid a slightly different
  problem and produced bad games as a result. When Dune2000 was
  developed, the "tank-rush" tactic for RTS games was a hot topic
  because of its continued devastating effectiveness in almost
  every RTS (Starcraft hadn't been released yet; both games came out
  in 1998 [ http://www.mobygames.com/game/versions/gameId,378/ and
  http://www.mobygames.com/game/versions/gameId,331/ ]). I've never
  spoken to the developers, but the game-design changes for Dune2000
  suggest that weakening this tactic - or removing it altogether -
  were a high priority. But if so, they were trying to solve the
  wrong problem:

  I'm still an avid fan of Dune2, and although I enjoyed Dune2000,
  it managed not only to "make no single strategy universally
  effective" but in the process made "almost no single strategy EVER
  effective". If you contrast 2 and 2000 (rather a nice case study
  actually, given the amount that DIDN'T change, and the big
  separation in time between release, allowing people to think about
  other things for a long time), you see that 2 had several nice
  emergent strategies. For instance, I don't believe that anyone
  deliberately included the "turret-rush" tactic - building a line
  of concrete (undestroyable) up to the opponent's base, then
  building a load of rocket turrets at the end.

  The tactic wasn't as devastating as a tank rush, since you could
  only build one turret at a time, and if the opponent could blow up
  the first turret you built before it did significant damage, they
  could repair the buildings hit before your next turret appeared.

  On the other hand, if you targeted all your turrets at a
  seemingly unimportant building (one the opponent - even the AI -
  didn't care about losing), you could destroy that, and then use a
  2x2 concrete block which could "jump" your building-radius over
  enemy structures, since it could be placed even if only one square
  was unoccupied (the other 3 were ignored). Then you could build
  your next turret in the spot previously occupied by the
  un-important building. This way you could advance your turrets
  deeper and deeper into the base, and hopefully eventually place
  one in a position where it was actually only targetable by a
  small number of enemy units (since enemies couldn't move "through"
  buildings).  Then it could sit there, firing and being repaired,
  and very slightly swing the tide in your favour.

  Persevere long enough, and get lucky occasionally, and you could
  eventually overrun most bases with "turret-attacks" - but it was
  never a certainty: accidentally destroy the wrong building, let
  all the opponent's vehicles in too close, and you lost your chance
  forever.

  I think the single saving grace for Dune2, tactically, was the
  inability to select multiple units. This was of course also the
  most annoying UI feature / design flaw. But it weakened the tank
  rush to the status of being only one of several good tactics. (on
  a side-note, Shogun:Total War implements a similar brake on
  tank-rushes, by making individuals slow down a lot and stop
  fighting when trying to manoeuvre in the midst of a pitched
  battle. If your units get too enmeshed in the thick of things,
  their attack rating drops because they spend half their time
  turning back and forth trying to aim on a different unit.)

  Dune2000 had such an intricate complex web of units doing more or
  less damage depending on what they were targeting that I know
  lots of people who got fed up with watching a huge battle between
  hundreds of units come down to the presence or absence of a
  single enemy (who you probably didn't even pick out amongst all
  the other units). Tank rushes were almost worthless, even with a
  variety of units, because it was too hard to get the right units
  to the front every time the "nearest enemy" changed from one unit
  type to another - if you didn't, your huge tanks would get blown
  to pieces by a single soldier, etc.

Quake:
------

  Quake was dangerously close to violating aspect 5 (related
  strategies should not dead-end the player; i.e. they should always
  have an opportunity to escape), but in the end it was saved by
  respawning items and good level design (specifically the placement
  of item and player spawn points): note that all the "good" Quake
  levels place the most mutually-destructive weapons/powerups far
  apart, so that even if a player has 200 armour, the lightning gun
  and quad damage, they cannot hold onto the advantage forever. The
  Quad times out before it reappears, so if they sit still (to get
  the next quad) the opponent can go and pick up the 200 armour AND
  the lightning gun too, and knows that sooner or later the first
  player will become equally vulnerable for long enough to kill
  them. If the first player runs around denying the armour to their
  opponent, the opponent runs around as well, picking up weapons and
  other armour instead. And if the dominant player does NOT camp the
  Quad, the underdog player can grab it for an immense
  advantage. Rather like scissors/paper/stone.
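
  The scissors/paper/stone comparison can be checked mechanically
  against aspect 3. The Python sketch below assumes the outcomes of
  strategy-versus-strategy matchups have been written down as a
  payoff table, and reports any strategy that is never worse than an
  alternative and sometimes better. The function name and the "rush"
  table are invented for illustration, not measured from any game.

```python
def universally_optimal(payoff):
    """Return strategies that do at least as well as every
    alternative against every opposing choice, and strictly better
    at least once.

    payoff[a][b] is the result for a player choosing a against b.
    """
    names = list(payoff)
    winners = []
    for a in names:
        others = [o for o in names if o != a]
        never_worse = all(payoff[a][b] >= payoff[o][b]
                          for o in others for b in names)
        sometimes_better = any(payoff[a][b] > payoff[o][b]
                               for o in others for b in names)
        if never_worse and sometimes_better:
            winners.append(a)
    return winners


# Scissors/paper/stone: 1 = win, 0 = draw, -1 = loss.
rps = {
    "stone":    {"stone": 0,  "paper": -1, "scissors": 1},
    "paper":    {"stone": 1,  "paper": 0,  "scissors": -1},
    "scissors": {"stone": -1, "paper": 1,  "scissors": 0},
}

# Hypothetical degenerate game in which rushing always pays.
rush = {
    "rush":   {"rush": 0,  "turtle": 1},
    "turtle": {"rush": -1, "turtle": 0},
}
```

  Here universally_optimal(rps) returns an empty list, while
  universally_optimal(rush) returns ["rush"], which is the situation
  aspect 3 says to avoid.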

  Side-note: Doom, of course, had the option of "no respawn" for
  items when playing in deathmatch - and proved almost no fun at all
  in that mode. No surprise that the option disappeared for Quake,
  although of course the "no respawn" was a natural extension of the
  single-player gameplay.

Adam M
_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev