[MUD-Dev] Text Parsing

Ross Nicoll rnicoll at lostics.demon.co.uk
Tue Jun 1 21:38:01 CEST 1999


On Tue, 1 Jun 1999, Kylotan wrote:

> > Wouldn't take much memory, but coding time would be very, very long
> > compared to a simple parser, and it would have a much higher CPU usage
> > too (although that may not make so much of a difference).
> Well, I would have thought that something that used to execute within half
> a second on a Z80 CPU would not be a problem, performance-wise.
Really depends on if you're using an interpreted/byte compiled internal
language...

> But then, if I fully understood what was going on here, I wouldn't need
> to be asking :)  I expect, however, that the code in modern
> interactive-fiction engines such as Inform or Tads has a lot more code
> than is needed for a parser of the complexity I am talking about.
Don't think either is really that much ahead actually. Hang on, I'll dig up
some code...

Verb 'take' 'carry' 'hold'
                * multi                          -> Take
                * 'off' worn                     -> Disrobe
                * multiinside 'from' noun        -> Remove
                * multiinside 'off' noun         -> Remove
                * 'inventory'                    -> Inv;


That was a definition for one of the basic functions (surprise surprise).
Basically it says: for these arguments, use this function. The
difficult bit is working out which arguments are which. Each function can
have globally defined functions that are called before and/or after it is
called, as well as similar functions on each object.
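
As a rough sketch, the same table could be written down as plain data in
Python. This is not how Inform stores it; the names TAKE_GRAMMAR, VERBS and
grammar_for are made up for illustration, and literal words keep their
quotes so they can be told apart from token types such as multi or noun.

```python
# Hypothetical Python mirror of the Inform grammar lines above.
# Each entry is (pattern, action); quoted strings are literal words,
# unquoted ones are token types to be resolved against the game world.
TAKE_GRAMMAR = [
    (("multi",), "Take"),
    (("'off'", "worn"), "Disrobe"),
    (("multiinside", "'from'", "noun"), "Remove"),
    (("multiinside", "'off'", "noun"), "Remove"),
    (("'inventory'",), "Inv"),
]

# All three verb words share the same grammar table.
VERBS = {word: TAKE_GRAMMAR for word in ("take", "carry", "hold")}


def grammar_for(command):
    """Return the grammar table for the first word of a command, or None."""
    words = command.lower().split()
    if not words:
        return None
    return VERBS.get(words[0])
```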

Hmmm. Okay, I find it easiest to code from examples:

> take the big red chair

I'm pretty sure my MUD could handle that perfectly well. Each object has
its name (chair, in this case), aliases (chair could also be seat), and
adjectives (big, red and wooden would work here). If you want a copy of
the function that identifies objects, tell me. It's pretty hideous
though...
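
For comparison, a much less hideous sketch of the same idea in Python. It
assumes the last word of the phrase is the noun and everything before it is
an adjective, and it drops articles; Thing, matches and find_objects are
hypothetical names, not the function mentioned above.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    name: str
    aliases: set = field(default_factory=set)
    adjectives: set = field(default_factory=set)


ARTICLES = {"the", "a", "an"}


def matches(thing, phrase):
    """True if the phrase could describe the thing.

    Assumption: the last word is the noun, all earlier words are adjectives.
    """
    words = [w for w in phrase.lower().split() if w not in ARTICLES]
    if not words:
        return False
    *adjectives, noun = words
    if noun != thing.name and noun not in thing.aliases:
        return False
    return all(adj in thing.adjectives for adj in adjectives)


def find_objects(phrase, things):
    """Return every thing in the list that the phrase could refer to."""
    return [t for t in things if matches(t, phrase)]
```

With a chair defined as Thing("chair", {"seat"}, {"big", "red", "wooden"}),
both "the big red chair" and "wooden seat" would pick it out, while "small
chair" would not.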

Another example:

> take cookie then eat cookie

My MUD would fall over at this. It was never told how to deal with "then".
Simplest method seems to be to split commands up around "then".

> say what happened then
[pauses to shuffle his HD's partitions around]

Would cause problems with that though, as it would create two commands,
"say what happened" and "", so some way of indicating that arguments
should be passed straight onto the function, instead of being parsed,
should probably be included.

At this stage, the program can try sorting through the arguments. So, the
order so far is:

1. Take in a command.
2. Find the function that matches the first word of the command.
3. If the function specifies that arguments should not be parsed, pass the
   arguments straight into the function.
4. Otherwise split up the command into several commands, and parse each
   individually.
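
The steps above can be sketched roughly as follows in Python. RAW_VERBS is a
made-up set standing in for "the function specifies that arguments should
not be parsed"; a real driver would presumably store that flag on the
function itself.

```python
# Hypothetical: verbs whose arguments are passed through unparsed.
RAW_VERBS = {"say"}


def split_commands(line):
    """Split a line into commands around "then".

    If a command starts with a raw verb, the rest of the line belongs to it
    untouched, so "say what happened then" stays one command.
    """
    words = line.split()
    commands, current = [], []
    for i, word in enumerate(words):
        if not current and word.lower() in RAW_VERBS:
            # Start of a raw command: everything left goes to the function.
            commands.append(" ".join(words[i:]))
            return commands
        if word.lower() == "then":
            if current:
                commands.append(" ".join(current))
                current = []
        else:
            current.append(word)
    if current:
        commands.append(" ".join(current))
    return commands
```

One consequence of this choice is that nothing can be chained after a raw
command, since "then" is no longer special once "say" has been seen.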

Back to the "take" command, as used in Inform. Let's look at the most
simple usage:

> take cup

Anything can parse that. What about a list:

> take cup and saucer

No problem, just split around "and", and call the function for each in
turn. But if we want three objects, do we write:

> take cup and saucer and spoon

or

> take cup, saucer and spoon

in which case, we have to make sure that the program doesn't start trying
to find "cup," in the room.
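
One way to accept both forms, and to keep the comma from sticking to "cup",
is to treat commas and "and" as the same separator. A minimal sketch in
Python, where split_object_list is a made-up name:

```python
import re

# A comma (optionally followed by "and"), or a standalone "and".
# The surrounding whitespace is part of the separator, so words that merely
# contain "and" (such as "sand") are left alone.
_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+", re.IGNORECASE)


def split_object_list(text):
    """Split "cup, saucer and spoon" into its individual object phrases."""
    return [part for part in _SEPARATOR.split(text.strip()) if part]
```

This only separates the phrases; each one still has to be matched against
the objects in the room afterwards.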

Second syntax: "take off <object>". Here it gets really fun:

> take off coat

seems simple enough. Thing is, do you first check for an object that can
be described as an "off coat", or first check through the list of possible
arguments to "take" for a match? The first option starts adding reserved
words ("off button" anyone?), the second one has problems with other
syntaxes:

> take small box from wooden table

A search for this syntax would probably be best implemented as a search
for "from", and then a check that the syntax is okay, if a "from" is
found. There is also:

> take gold ring from the small box from wooden table

but quite frankly, anyone typing that in deserves everything they get.

So for parsing the command, after it is split up appropriately, the
sequence that works best is probably:

1. Go through all possible syntaxes of command, checking for a match.
2. Go with first syntax match, or print error message if there are none.
3. Parse lists of objects, splitting around commas and "and"s.
4. Pass the resulting mess to the function.

Although 4 might be a lot simpler if the function was just called once for
each item in a list.
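
A rough Python sketch of that sequence, with the function called once per
list item. Everything here is an assumption for illustration: the SYNTAXES
table, the OBJ marker and the helper names are invented, object phrases are
left as plain strings rather than looked up in the room, and the syntaxes
containing literal words are listed first so that "off" and "from" are
tried before the catch-all object slot.

```python
import re

OBJ = object()  # marker for "an object phrase goes here"

# Hypothetical syntax table; more specific patterns come first.
SYNTAXES = {
    "take": [
        (["off", OBJ], "Disrobe"),
        ([OBJ, "from", OBJ], "Remove"),
        ([OBJ], "Take"),
    ],
}

_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+")


def split_list(text):
    """Split an object list around commas and "and"."""
    return [part for part in _SEPARATOR.split(text) if part]


def match(pattern, words):
    """Return the phrases captured by each OBJ slot, or None on no match."""
    if not pattern:
        return [] if not words else None
    head, rest = pattern[0], pattern[1:]
    if head is not OBJ:
        # Literal word: must be the next word of the command.
        if words and words[0] == head:
            return match(rest, words[1:])
        return None
    # Object slot: take one or more words, shortest phrase first.
    for n in range(1, len(words) + 1):
        tail = match(rest, words[n:])
        if tail is not None:
            return [" ".join(words[:n])] + tail
    return None


def parse(command):
    """Return one (action, item, ...) tuple per item, or None if no match."""
    words = command.lower().split()
    if not words or words[0] not in SYNTAXES:
        return None
    for pattern, action in SYNTAXES[words[0]]:
        captured = match(pattern, words[1:])
        if captured is not None:
            first, rest = captured[0], captured[1:]
            return [(action, item, *rest) for item in split_list(first)]
    return None
```

Under these assumptions "take cup, saucer and spoon" comes out as three
separate Take calls, and "take small box from wooden table" as a single
Remove with "small box" and "wooden table" as its arguments. The nested
"from" example simply ends up with everything after the first "from" as the
container phrase.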

I'd actually really appreciate comments on other possible
problems/solutions with this, because it's looking writable, so I might
just try implementing it all in my latest driver.
--
  _   __  __  __
 /_) / / (_  (__
/\  /_/  __)   /
______________/




_______________________________________________
MUD-Dev maillist  -  MUD-Dev at kanga.nu
http://www.kanga.nu/lists/listinfo/mud-dev



