[MUD-Dev] (fwd) Re: command parsers: a modest proposal (with apologies to J. Swift)
J C Lawrence
claw at under.engr.sgi.com
Tue Jul 7 11:14:17 CEST 1998
From: Richard Bartle <76703.3042 at CompuServe.COM>
Subject: Re: command parsers: a modest proposal (with apologies to J. Swift)
Newsgroups: rec.games.mud.admin
Date: Sat, 04 Jul 1998 18:48:44 -0400
"Ilya, SCC, Game Commandos" <ilya at gamecommandos.com> wrote:
>But you never mentioned how it was that MUD2 handled the
>idea of the forgiving parser.
Well OK, I'm happy to explain.
It could be MORE forgiving, I hasten to add, but it's better than
most of what's out there.
>Do you implement any aspect of this?
I implement all aspects that I described in my earlier post.
>Are several likely synonyms of command words all functional (or all
>suggesting the proper alternative syntax)? Or do similar words do vastly
>different things such?
I have two approaches.
The first is the straight synonym, or abbreviation. Example: if
someone can't be bothered to type GET, they can type G. In that case, G is
a vocabulary item which points at the same (command) object that GET does.
Similarly, DROP, DR, GIVE, GI and INSERT are all synonyms for the action of
transferring an object from the person carrying it to some other container
(or person, or room - they're just specialisations of the class CONTAINER).
The second is through the class hierarchy. Example: in most cases
if you TOSS an object, it should execute the same code that THROW does
(which in turn normally behaves like DROP does) except in the special case
where the object being tossed is a subclass of COIN. So I can define:
        { drop object }:
                <default code for DROP OBJECT here>
        { throw firework }:
                <something telling you not to throw fireworks here>
        { toss coin }:
                <something telling you whether it came up heads or tails here>
Thus, if someone does TOSS COIN they get the message about heads or
tails; if they do TOSS FIREWORK they get the message about not throwing
fireworks; if they do TOSS BANANA they get the default code for DROP
BANANA. However, if they typed DROP COIN then it would just drop it, without
the message.
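The matching described above can be sketched in Python. This is only an illustration under stated assumptions, not MUD2's actual code: the parent tables, the template list and the result strings are hypothetical, and the sketch assumes templates are tried most-specific first with the first match winning.

```python
# Hypothetical parent links: each verb or noun class names its superclass.
VERB_PARENT = {"toss": "throw", "throw": "drop", "drop": None}
NOUN_PARENT = {"coin": "object", "firework": "object",
               "banana": "object", "object": None}

# Templates in the order they are tried (assumed: most specific first).
TEMPLATES = [
    ("toss", "coin", "heads or tails"),
    ("throw", "firework", "do not throw fireworks"),
    ("drop", "object", "default drop"),
]

def is_a(name, ancestor, parents):
    """True if name is ancestor itself or inherits from it."""
    while name is not None:
        if name == ancestor:
            return True
        name = parents[name]
    return False

def match(verb, noun):
    """Return the result of the first template that covers verb and noun."""
    for t_verb, t_noun, result in TEMPLATES:
        if is_a(verb, t_verb, VERB_PARENT) and is_a(noun, t_noun, NOUN_PARENT):
            return result
    return "no match"
```

With these tables, match("toss", "banana") falls through to the DROP OBJECT template, while match("drop", "coin") never reaches the TOSS COIN one because DROP is not a subclass of TOSS.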
>Example: if a character wants to start a fight with herman,
>another character, what happens with these various commands:
>hit herman, kick herman, slap herman, poke herman, punch
>herman, chop herman, attack herman, kill herman, murder
>herman, combat herman, fight herman, etc.?
OK, well I implement HIT as a one-time hit, and KILL as initiating a
fight. This distinction only makes sense for players and mobiles, though, so
for ordinary objects like doors and boxes they behave the same. I therefore
make HIT a subclass of KILL and define:
        { KILL OBJECT }:
                <damages the object>
        { KILL CREATURE }:
                <initiate round-based combat routines>
        { HIT CREATURE }:
                <damage the creature once, but don't start combat>
For the other commands you mentioned, ATTACK, MURDER, COMBAT and
FIGHT are all direct synonyms of KILL. I make KICK, SLAP, PUNCH and CHOP
be subclasses of HIT, so I can tell the person who receives the blow what
kind of blow it was. POKE is implemented as a separate command which tells
the person who was poked that they were poked, but does no damage; you
might use it to wake someone up, for example.
>(I mention these because some games implement kick as
>a combat command, some implement it as a 'social;' some
>games require murder for player-vs-player combat, others
>do not; some use attack, some kill, and so forth)
Well that's not really a parser issue, then; it's just a matter of
how each particular implementor chooses to do it.
>Example: if a character wants to pick up a stick off
>of the ground, what happens with these commands: get stick,
>take stick, obtain stick, grab stick, pick up stick, get
>stick from ground, grab stick off of ground, etc.?
GET, TAKE, OBTAIN and GRAB are direct synonyms. PICK followed by the
adverb UP is folded to give an internal command object, PICKUP, which is a
subclass of GET; this is so I can trap various other interpretations of
"pick up" on specific objects (eg. players) before I let it through to the
default GET; I could bind PICK UP directly to GET if I chose.
GET STICK FROM GROUND I'll have to check, but off the top of my
head I think it does the same as GET STICK except it complains if there is
no ground, eg. you're falling off a cliff or you're at sea in a boat. I'm
not sure, though - it's ages since I wrote that code.
GRAB STICK OFF OF GROUND would, I believe, fail to parse, complaining
that OFF is a preposition and it's being followed by another preposition
when it was expecting maybe a noun. It would take me maybe a minute to get
it to parse (by declaring OFF to be an adverb as well as a preposition, and
recompiling).
>Not to mention the simple variations with 'a,' 'an,' or 'the'
>articles included or not (get the stick, take the stick,
>take a stick, obtain the stick, grab the stick, etc.)
My parser has a weak spot with plurals, in that I treat every
object as if it were plural. Thus, GET STICK in a room with 3 sticks in it
would pick them all up. If you just wanted one, then GET A STICK or GET 1
STICK would work, as would use of a discriminating adjective like GET LIT
STICK or GET HEAVIEST STICK.
>(Again, none of these is an unreasonable guess on the part
>of the player. But implementations abound where they have
>vastly different meanings.
Yes, I know, although I'm not sure whether all of these are parser
issues: some are just issues of choice on the part of the implementer for
dealing with genuinely ambiguous words.
>'Pick up x' is often interpreted as 'use lockpicking skills to pick the
>lock named 'up'';
Well if they implemented a class hierarchy for commands, that
wouldn't happen. If PICK is a subclass of GET then:
        { PICK LOCK }:
                <code for lockpicking>
        { PICK FLOWER }:
                <code for picking a flower>
        { GET OBJECT }:
                <code for picking up something>
What's more, if the parser actually pays attention to the adverbs,
PICK LOCK and PICK UP LOCK need not amount to the same thing.
>grab may be a social, not a synonym for take;
Sigh...
        { GRAB PLAYER }:
                <socialisation code here>
        { GRAB OBJECT }:
                call get(object)
>Please note that I do not necessarily espouse that all
>potential synonyms always be active -- only that every effort
>be made to reduce the pain required on the part of the
>player to guess the proper command
I really must process that copy of Roget's Thesaurus I downloaded a
couple of years ago! I've added many synonyms and subclasses for commands
over the years, but not exhaustively. Still, it's pretty good: if someone
types GENUFLECT instead of BOW, it works...
>So, since you've entered into the exchange, Mr. Bartle,
Hmm, well if you're going to be formal it's Dr Bartle..!
>do complete your answer with a description of your venerable
>MUD2's way of implementing anything along the lines of a
>forgiving parser.
I get the feeling you may have the impression that I don't know
what I'm talking about...
Oh well, here goes!
There are several stages to parsing in a MUD. Ideally, they should
all be interconnected so you can backtrack from one to the other, but in
practice they tend to be dealt with by different processes, perhaps even on
different machines, so it's never going to be as good as it might be. Here's
how MUD2 does it, with reference to its forgivingness. I've simplified things
a little (it allows certain verbs to enquote the rest of the sentence, for
example, which is a pain to implement), but this is roughly what it does:
Stage 1: tokenising
The input line is split into tokens, ie. collections of symbols
which are potentially meaningful. Whitespace and garbage characters are
stripped away.
Unforgiving parsers might complain about "GET BOX" with two spaces
after "GET", or the fact that "GET" is in upper case.
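A minimal tokeniser along these lines might look like the following Python sketch. The choice of which characters count as meaningful (letters, digits, apostrophes, commas and full stops) is an assumption made for illustration; the post does not say what MUD2 keeps.

```python
import re

def tokenise(line):
    """Split a command line into upper-case word and punctuation tokens."""
    # Runs of letters, digits and apostrophes become word tokens; commas and
    # full stops are kept as punctuation tokens; whitespace and every other
    # character is treated as garbage and dropped.
    return re.findall(r"[A-Z0-9']+|[,.]", line.upper())
```

Case and spacing then stop mattering: "get   box", "GET BOX" and " Get #box!" all yield the same two tokens.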
Stage 2: dictionary lookup
The symbols in the tokenised command line are looked up in the
game's dictionary/vocabulary. If they're not there, then they must
either be ignored, guessed at, or the parsing process must stop. Unforgiving
parsers will bomb out with an error message. As it happens, MUD2 does that,
too. In MUD1, the words were ignored, but I found that most problems of this
nature were due to players' typing mistakes. In MUD2, I added synonyms for
most of the common errors ("teh" for "the" and so on), but stopped short of
trying to guess what people meant because there were potential dangers if
the wrong guess was made. If someone types KI JIM meaning KISS JIM, and the
game decides that KILL JIM is more likely, well, you get the picture...
People seem to prefer to have the error pointed out so they can use
line-editing and correct it. This may not be the case on other MUDs, with
different player cultures.
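The stop-on-unknown-word policy can be sketched as below. The vocabulary contents, the exception name and the error wording are all hypothetical; the sketch only shows synonyms and common typos sharing an entry, and an unknown word halting the process rather than being guessed at.

```python
class UnknownWord(Exception):
    """Raised when a token is not in the vocabulary."""

# Hypothetical vocabulary: synonyms, abbreviations and common typos all
# point at the same canonical entry.
VOCABULARY = {
    "GET": "GET", "G": "GET", "TAKE": "GET",
    "THE": "THE", "TEH": "THE",
    "KISS": "KISS", "KILL": "KILL",
    "BOX": "BOX", "JIM": "JIM",
}

def look_up(tokens):
    """Map each token to its vocabulary entry, stopping at the first unknown."""
    entries = []
    for token in tokens:
        if token not in VOCABULARY:
            # No guessing: KI could mean KISS or KILL, so report it instead.
            raise UnknownWord(f"I don't know the word {token!r}.")
        entries.append(VOCABULARY[token])
    return entries
```

Here look_up(["G", "TEH", "BOX"]) gives the same entries as GET THE BOX, whereas look_up(["KI", "JIM"]) raises UnknownWord and leaves the correction to the player.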
Stage 3: parsing
The dictionary entry for each symbol returns a set of parts of
speech which this symbol can take. Many English words have different
meanings depending on where they are in a sentence (eg. PLANT can be a
verb, adjective or noun), and in a MUD this is compounded by abbreviations
(eg. F is short for the verb FLEE and the preposition FROM). The job of the
parser is to determine which parts of speech apply for each command line.
For MUD2, I use an implicit, backtracking grammar so arranged as to find the
most common forms first (eg. VERB NOUN before ADVERB VERB). I parse the
whole input line, and as a result will get either a failed parse or a
successful parse. There's not much I can do about a failed parse: if
someone types DOOR THE OPEN KEY WITH then a human may be able to figure out
what's meant, but it would bloat a parser to give it that capacity. What
I do, therefore, is tell the players what the parser DID understand, and
what it was expecting the symbol it failed on to be. Players can then
attempt to rephrase what they typed so as to fit what the parser can cope
with. It beats "I don't understand that, please try saying it a different
way". Since my parser knows about verbs, nouns, adjectives, adverbs,
prepositions, definite/indefinite objects, pronouns, superlatives, numbers,
conjunctions and punctuation, it can handle a reasonable range of
imperative sentences. DROP THE BIGGEST OF THE LIT STICKS IN THE BOAT THEN
DROP IT AND THE LEAST BEST WEAPON IN THE RIVER would parse (and be
successfully executed, not that people often bother typing stuff in with that
kind of complexity).
For synonyms, this is the stage where they are reconciled with
their main meaning. The parts of speech returned by a successful parse of
GET LONGSWORD are identical to those returned by G LS.
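As a rough illustration of a backtracking parse over ambiguous parts of speech, here is a small Python sketch. It is a deliberately tiny stand-in, not MUD2's grammar: the lexicon, the list of sentence patterns and the shape of the failure report are all assumptions. Patterns are listed most common first, and on failure the function returns what it did understand plus the parts of speech it was expecting next.

```python
# Hypothetical lexicon: each word lists the parts of speech it can take.
LEXICON = {
    "PLANT": ("verb", "noun", "adjective"),
    "F": ("verb", "preposition"),
    "GET": ("verb",),
    "POT": ("noun",),
    "TROLL": ("noun",),
    "QUICKLY": ("adverb",),
}

# Hypothetical sentence forms, most common first.
PATTERNS = [
    ("verb",),
    ("verb", "noun"),
    ("verb", "adjective", "noun"),
    ("verb", "preposition", "noun"),
    ("adverb", "verb"),
]

def parse(words):
    """Return (tagged_words, None) on success, else (understood, expected)."""
    furthest = {"depth": -1, "tagged": [], "expected": set()}

    def step(tagged):
        i = len(tagged)
        tags = tuple(tag for tag, _ in tagged)
        live = [p for p in PATTERNS if p[:i] == tags]
        if i == len(words):
            # Succeed only if some pattern is used up exactly.
            return tagged if any(len(p) == i for p in live) else None
        expected = [p[i] for p in live if len(p) > i]
        if i > furthest["depth"]:
            # Remember the first time the parse got this deep, for reporting.
            furthest.update(depth=i, tagged=tagged, expected=set(expected))
        for tag in dict.fromkeys(expected):      # de-duplicate, keep order
            if tag in LEXICON.get(words[i], ()):
                result = step(tagged + [(tag, words[i])])
                if result is not None:
                    return result                # first success wins
        return None                              # dead end: backtrack

    result = step([])
    if result is not None:
        return result, None
    return furthest["tagged"], furthest["expected"]
```

For GET PLANT POT the sketch first tries PLANT as a noun, hits a dead end, backtracks and succeeds with PLANT as an adjective. For GET QUICKLY it reports that GET was understood as a verb and that a noun, adjective or preposition was expected next.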
Stage 4: binding
The result of the parse is a set of sentences/commands. These are
handled one at a time. A command consists of a verb, followed by a list of
verb qualifiers (adverbs and prepositions) followed by a list of noun groups.
Noun groups are a noun, followed by a list of noun qualifiers (adjectives,
definite/indefinite objects etc.). The binder's job is to associate a
word with a real object. If I say GET KEY, then the binder has to find
which key(s) I mean, in the context of the verb; DROP KEY would bind to a
different key (one I'm holding, rather than one on the floor).
OK, so first I apply all the modifiers to the verb, to get a new
verb. Most modifiers do nothing, but some are important - THROW KNIFE AT
MARY is not the same as THROW KNIFE TO MARY. Once I have the modified
verb, this determines the point at which the search for objects begins
(eg. the player for DROP, the room for GET). The search traverses all
opened, accessible containers from the search point, and finds all objects
which are instances of the class implied by the noun. DROP GOLD would bind a
list of all objects of class GOLD that you were carrying to the first noun
slot. Once this set of noun bindings has been created, it is filtered by the
noun qualifier list. If you said GET GOLDEN FORK and there were three forks
in the room, then these three would be the initial set, but applying the
adjective (function) GOLDEN to each element would produce a new set with
only golden forks in it. A similar process works for superlatives, and for
excluding prepositions (DROP EVERYTHING BUT MY SWORD).
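The search-then-filter step can be sketched as follows. The Thing record, its field names and the way adjectives are modelled (as a set of traits) are assumptions for illustration only; superlatives and excluding prepositions are left out.

```python
from dataclasses import dataclass, field

@dataclass
class Thing:
    name: str
    classes: tuple                      # every class this thing is an instance of
    traits: frozenset = frozenset()     # adjectives that currently apply
    is_open: bool = True                # closed containers hide their contents
    contents: list = field(default_factory=list)

def reachable(container):
    """Yield everything inside container, descending only into open things."""
    for item in container.contents:
        yield item
        if item.is_open:
            yield from reachable(item)

def bind(start, noun, adjectives=()):
    """Find instances of noun reachable from start, then filter by adjectives."""
    found = [t for t in reachable(start) if noun in t.classes]
    for adjective in adjectives:
        found = [t for t in found if adjective in t.traits]
    return found

# Hypothetical room: two forks in the open, a third inside a closed chest.
room = Thing("room", ("ROOM",), contents=[
    Thing("fork1", ("FORK", "OBJECT"), frozenset({"GOLDEN"})),
    Thing("fork2", ("FORK", "OBJECT")),
    Thing("chest", ("CHEST", "OBJECT"), is_open=False, contents=[
        Thing("fork3", ("FORK", "OBJECT"), frozenset({"GOLDEN"})),
    ]),
])
```

In this room, binding FORK from the room finds fork1 and fork2 (fork3 is hidden by the closed chest), and adding the adjective GOLDEN narrows the set to fork1. An empty result is what would trigger a "You don't see any ..." message.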
Some nouns are dynamic, and have to be bound on the fly. There are
no objects of class FOE, for example, but if I were fighting an invisible
player I may want to reference them as the target for a spell, say. If I
don't know their name, what can I do? Well, using the word FOE will bind to
all creatures with whom you are currently engaged in combat. I could thus
BLIND FOE to try to cast a BLIND spell at them. Pronouns are handled in a
similar fashion.
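One way to picture dynamic nouns is as words bound by a function rather than by a class search. The Python sketch below assumes a hypothetical player record with "fighting" and "last_objects" entries; treating IT as "whatever was last referred to" is likewise an assumption, since the post only says pronouns work in a similar fashion.

```python
def foes(player):
    """Everyone the player is currently fighting."""
    return list(player["fighting"])

def last_mentioned(player):
    """Assumed pronoun rule: IT means the objects most recently referred to."""
    return list(player["last_objects"])

# Hypothetical table of nouns that are computed on the fly.
DYNAMIC_NOUNS = {"FOE": foes, "IT": last_mentioned}

def bind_dynamic(noun, player):
    """Return what a dynamic noun stands for now, or None for ordinary nouns."""
    resolver = DYNAMIC_NOUNS.get(noun)
    return resolver(player) if resolver else None
```

A None result tells the caller to fall back to the ordinary class-based search.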
At the end of the parse, I have a verb, and an optional number of
sets of objects. If any of these sets are empty, then the player has
referenced an object that they have no access to: that, therefore, is what
I tell them. KILL OX. "You don't see any ox.". HIT BILL WITH CARROT.
"You don't see any carrot.". There ARE carrots and oxen in the game, just
not where you are.
Stage 5: despatching
At this stage, I have a verb, and 0 or more non-empty sets of nouns
(in fact, the way MUD2 does it, it's limited to 0, 1 or 2 sets of nouns: it
can't handle stuff like PUT THE EGG IN THE PAN WITH THE SPOON, but a day's
work would enable it to do so). I now have to decide what actual functions
to call. As I mentioned earlier, functions are legitimate objects in MUD2,
so rather than going through all the objects telling them to invoke the
function on them, I simply tell the function to invoke itself with each
object as a parameter, in turn.
        Example: verb=GET, first noun set=[KEY1, KEY2, KEY3]
        Invoke:  GET(KEY1)
                 GET(KEY2)
                 GET(KEY3)
I've used the traditional syntax for calling functions, there; it's
really more like:
        call(GET, KEY1)
        call(GET, KEY2)
        call(GET, KEY3)
If there are no noun groups, it's trivial to call the verb with no
parameters. There is something of an issue, though, if there are two sets
of nouns, eg. GET INSECTS WITH NETS when there are 3 insects and 2 nets.
What does it mean? It's not exactly clear what it means in English, let
alone in a MUD! The solution I adopt is fairly arbitrary, but then any
solution will be arbitrary in some respect. I call GET three times,
once for each insect, and use the first net only. If, however, the net is
somehow destroyed by getting one of the insects (eg. it breaks), I would
switch to the second net. Also, binding would cease if the player were to
die as a result of executing a command.
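The two-noun-set rule can be sketched in Python as below. The dictionary representation of a net, the "broken" flag and the player_alive callback are hypothetical, and stopping when no usable tool remains is an assumption the post does not cover.

```python
def despatch(verb, targets, tools=None, player_alive=lambda: True):
    """Invoke verb once per target; with tools, pass the first intact one."""
    for target in targets:
        if not player_alive():
            break                      # a fatal result ends the remaining calls
        if tools is None:
            verb(target)
            continue
        intact = [tool for tool in tools if not tool["broken"]]
        if not intact:
            break                      # assumption: nothing left to use, so stop
        verb(target, intact[0])

# Hypothetical example: GET INSECTS WITH NETS, 3 insects and 2 nets.
log = []
nets = [{"name": "NET1", "broken": False}, {"name": "NET2", "broken": False}]

def get(insect, net):
    log.append((insect, net["name"]))
    if insect == "WASP":
        net["broken"] = True           # hypothetical: the wasp ruins the net

despatch(get, ["BEE", "WASP", "MOTH"], nets)
```

After this run the log shows BEE and WASP caught with NET1 and MOTH with NET2, because NET1 broke on the second call.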
Unforgiving parsers may not be able to despatch on functions, which
means that the same verb cannot affect radically different objects in
different ways. What's appropriate for C++ isn't necessarily appropriate for
MUDs.
Stage 6: executing
Execution takes place a function call at a time. Here, the function
call is matched in an hierarchical fashion against function definition
templates, until a match is found. A match is ALWAYS found, because I have
a top-level template which matches everything that hasn't been matched
elsewhere. When the match is found, the code is executed.
Originally, I had an option to rematch, but I found that I never
used it. PROLOG has a cut operator to prevent rematching, and my
operator was like an inverse of this (ie. default was to cut, explicit
operator causes the search to continue). Given that I didn't find I needed
it, and that it added a significant overhead to the matching process, I
removed it. For absolute purity, though, I should have left it in.
The execution of the command actually effects changes in the
game world. It also tests conditions of applicability: if you want to hit
someone with an axe, and injury has left you too weak to wield the axe, then
the command should fail with an appropriate error message. Similarly, if
you want to open a door and it's already open, you should be told that it's
already open, not that you've opened it. It's at this point that the final
stage of forgivingness takes place, by making intelligent assumptions. If
someone says OPEN DOOR, and the door is locked but they have a key which
fits, then they should be able to open the door. However, if they say OPEN
DOOR WITH SMALL KEY and the small key doesn't fit, they should be told it
doesn't fit, of course, but what if they had another key which does fit? I
decided not to open the door, as they actually specified which key to use,
however other MUD designers might prefer it to open in their game. There's
therefore some debate as to how far to go with this; I tend to err on the
side of caution. If someone says OPEN then I could search for a closed
door, search for a matching key, and open it. However, in MUD2 there are
other things people can open, some of which are deadly, and therefore I tell
them I need to know what it is they wish to open. It's within the parser's
capacity to say "What do you want me to open? (chest, door)" but I try to
avoid doing that. It always annoyed me as a player, and it's not exactly
hard for them to use a line editing command to pull back the OPEN and then
add CHEST or whatever anyway. I realise that there are certain Zork-hardened
players who would advocate the answer-the-question approach, though.
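The OPEN DOOR behaviour described above can be sketched as a single function. The door record, its field names and the message strings are hypothetical; the sketch assumes the binder has already confirmed that any named key is one the player carries.

```python
def open_door(door, carried_keys, named_key=None):
    """Try to open door, returning the message the player would see."""
    if door["open"]:
        return "It's already open."
    if door["locked"]:
        # An explicitly named key is the only one tried, even if another
        # carried key would fit; with no key named, any carried key will do.
        candidates = [named_key] if named_key is not None else carried_keys
        if door["fits"] not in candidates:
            if named_key is not None:
                return "The key doesn't fit."
            return "It's locked."
        door["locked"] = False
    door["open"] = True
    return "You open the door."
```

So OPEN DOOR succeeds for a player carrying the right key among several, while OPEN DOOR WITH SMALL KEY fails if the small key is the wrong one, even though a fitting key is in the player's hands.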
Hmm, I think that's about it!
I hope that helps answer your question.
Richard
--
J C Lawrence Internet: claw at null.net
(Contractor) Internet: coder at ibm.net
---------(*) Internet: claw at under.engr.sgi.com
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...