[MUD-Dev] Text Parsing

Travis Casey efindel at io.com
Sun May 30 23:39:05 CEST 1999


On Friday, May 28, 1999, Albert wrote:

[stuff about text parsing -- all snipped, since I'm commenting on the
idea in general, not on any specifics]

Someone else pointed out that a thing may have a name that consists of
multiple words.  This applies not just to bad naming on the part of
builders, but is a "feature" of the English language -- "ice cream",
for example, is such a thing.  Neither "ice" nor "cream" is an
adjective in "ice cream" -- it's a single noun consisting of two
words.

Proper names are often like this -- "London Bridge", "Travis Shannon
Casey", and "Grand Canyon" are all proper names for particular things,
and each is a noun consisting of multiple words.


For a parser that I was developing, I used patterns.  I developed it
to the point where there was a function for "trying" an input string
against a particular pattern, which would return an empty array if the
pattern was not matched, or an array filled with the matches for the
variable parts of the pattern if it was matched.  It did as much
processing as possible -- for example, the string:

 get the red sword, the blue book, and the green coin

used with the pattern

 get OBJECTS

would return an array containing pointers to those three objects,
which would itself be the first element of another array.  (It was
done this way so that

 give item 1, item 2, and item 3 to joe

with

 give OBJECTS to LIVING

could return an array of two elements -- the first element would just
be an array with the objects to give, and the second would be the
match for LIVING.)
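
As a rough illustration of the "try a pattern" function described
above (not the original code, which isn't shown in this message), here
is a minimal Python sketch.  The names try_pattern and split_list are
made up for the example, and the slots hold plain strings rather than
pointers to game objects.

```python
import re

def split_list(text):
    """Split 'a, b, and c' into ['a', 'b', 'c'] (hypothetical helper)."""
    parts = re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", text.strip())
    return [p for p in parts if p]

def try_pattern(pattern, line):
    """Try `line` against `pattern`.

    Returns [] if the pattern does not match; otherwise returns one
    element per variable (upper-case) part of the pattern.  An OBJECTS
    slot yields a list of phrases, any other slot yields a string.
    """
    slots, regex_parts = [], []
    for token in pattern.split():
        if token.isupper():              # variable part, e.g. OBJECTS
            slots.append(token)
            regex_parts.append(r"(.+?)")
        else:                            # literal keyword, e.g. "to"
            regex_parts.append(re.escape(token))
    match = re.fullmatch(r"\s+".join(regex_parts), line.strip())
    if match is None:
        return []
    return [split_list(text) if slot == "OBJECTS" else text
            for slot, text in zip(slots, match.groups())]
```

With this sketch, "get OBJECTS" yields a one-element array whose only
element is the list of object phrases, and "give OBJECTS to LIVING"
yields two elements, the list of objects and then the living's name.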

Verbs which could take an indirect object simply needed two tests;
thus, for give, I'd try the patterns:

 give OBJECTS to LIVING
 give LIVING OBJECTS
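
A small Python sketch of trying both forms in turn might look like the
following.  It is an assumption-laden illustration: match_give and the
`livings` set of known names are hypothetical, and returning the same
[objects, living] order for both forms is a choice made here, not
something the original parser is known to have done.

```python
def match_give(line, livings):
    """Return [objects_text, living] for either 'give' form, or []."""
    words = line.split()
    if len(words) < 3 or words[0] != "give":
        return []
    rest = words[1:]
    # Pattern 1: give OBJECTS to LIVING (try the last "to" first).
    for i in range(len(rest) - 1, 0, -1):
        if rest[i] == "to":
            living = " ".join(rest[i + 1:])
            if living in livings:
                return [" ".join(rest[:i]), living]
    # Pattern 2: give LIVING OBJECTS (longest living name first).
    for i in range(len(rest) - 1, 0, -1):
        living = " ".join(rest[:i])
        if living in livings:
            return [" ".join(rest[i:]), living]
    return []
```

The second form has no keyword between the two slots, so the sketch
has to know which names refer to living things before it can decide
where the split goes.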

Two "special" variables were PREP and APREP.  PREP could be any
place-indicating preposition; APREP included those plus two "action"
prepositions, "into" and "onto".  Thus, you could have:

 put OBJECTS APREP CONTAINER
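
To make the PREP/APREP distinction concrete, here is a short Python
sketch.  The exact membership of PREP is an assumption (the message
only says "place-indicating"), and match_put is a hypothetical name.

```python
# Assumed place-indicating prepositions; the real list is not given.
PREP = ("in", "on", "under", "behind")
# APREP is PREP plus the two "action" prepositions named above.
APREP = PREP + ("into", "onto")

def match_put(line):
    """Match 'put OBJECTS APREP CONTAINER'.

    Returns [objects_text, preposition, container_text], or [] if the
    line does not fit.  The first APREP word found is the split point.
    """
    words = line.split()
    if not words or words[0] != "put":
        return []
    # Start at 2 so OBJECTS is non-empty; stop early so CONTAINER is too.
    for i in range(2, len(words) - 1):
        if words[i] in APREP:
            return [" ".join(words[1:i]), words[i], " ".join(words[i + 1:])]
    return []
```

Splitting at the first preposition is the simplest choice; a command
such as "put the sword in the box on the table" would need the object
finder's help to decide which preposition is the real split point.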

The parser was modularized -- the top level broke the input string
into pieces, using keywords plus PREP and APREP as the initial split
points, and then passed each resulting piece to a function that would
try to find one or more matching objects.
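
The two-layer split can be sketched in Python roughly as follows.
Everything here is illustrative: break_up, parse, WORD_CLASSES and the
contents of PREP are assumed names and values, and the object finder
is passed in as a plain function so the top level knows nothing about
objects.

```python
PREP = {"in", "on", "under", "behind"}           # assumed membership
WORD_CLASSES = {"PREP": PREP, "APREP": PREP | {"into", "onto"}}

def break_up(pattern, line):
    """Top level: return [(slot, text), ...] or [] if no match."""
    words = line.lower().split()
    pieces, pos, slot = [], 0, None
    for token in pattern.split():
        if token.isupper() and token not in WORD_CLASSES:
            if slot is not None:
                return []        # adjacent slots need object knowledge
            slot = token
            continue
        # A keyword matches itself; PREP/APREP match any word in the class.
        allowed = WORD_CLASSES.get(token, {token})
        hit = next((i for i in range(pos, len(words))
                    if words[i] in allowed), None)
        if hit is None:
            return []
        if slot is None and hit != pos:
            return []            # stray words before a keyword
        if slot is not None:
            if hit == pos:
                return []        # nothing left for the slot
            pieces.append((slot, " ".join(words[pos:hit])))
            slot = None
        if token in WORD_CLASSES:
            pieces.append((token, words[hit]))
        pos = hit + 1
    if slot is not None:
        if pos >= len(words):
            return []
        pieces.append((slot, " ".join(words[pos:])))
    elif pos != len(words):
        return []
    return pieces

def parse(pattern, line, find_objects):
    """Break the line up, then hand each slot's text to the finder."""
    return [text if slot in WORD_CLASSES else find_objects(text)
            for slot, text in break_up(pattern, line)]
```

Because the finder is just a parameter, it can start out as a stub and
be refined later without touching the top level.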

I started off by implementing the top level, then slowly refined the
object finder, giving it functionality to handle lists with "and",
comma-separated lists, articles, objects in containers (e.g., "get the
sword in the box"), "my sword", "sword on the floor", numbers of
objects ("three swords"), number-indicated objects ("third sword"),
adjectives ("the green sword"), and probably a few other things I just
don't remember at the moment.
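
A cut-down object finder covering a few of those features (lists,
articles, adjectives, counts, and ordinals) could be sketched in
Python like this.  The Thing class, the word tables, and the naive
plural handling are all assumptions for the example; containers, "my",
and "on the floor" are left out.  Names are matched as whole
multi-word nouns, so "ice cream" works.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Thing:                       # hypothetical game object
    noun: str                      # may be several words: "ice cream"
    adjectives: frozenset = frozenset()

ARTICLES = {"a", "an", "the"}
COUNTS = {"one": 1, "two": 2, "three": 3}
ORDINALS = {"first": 1, "second": 2, "third": 3}

def matches(words, thing):
    """True if words end with the thing's noun and the rest are its adjectives."""
    name = thing.noun.split()
    if len(words) < len(name) or words[-len(name):] != name:
        return False
    return set(words[:-len(name)]) <= thing.adjectives

def find_one(phrase, things):
    """Resolve a single phrase such as 'the third sword'."""
    words = [w for w in phrase.lower().split() if w not in ARTICLES]
    count = ordinal = None
    if words and words[0] in COUNTS:
        count = COUNTS[words.pop(0)]
    elif words and words[0] in ORDINALS:
        ordinal = ORDINALS[words.pop(0)]
    if not words:
        return []
    if count is not None and count > 1 and words[-1].endswith("s"):
        words[-1] = words[-1][:-1]         # naive plural: swords -> sword
    hits = [t for t in things if matches(words, t)]
    if ordinal is not None:
        return hits[ordinal - 1:ordinal]   # [] if there are too few
    if count is not None:
        return hits[:count] if len(hits) >= count else []
    return hits[:1]                        # first match wins

def find_objects(text, things):
    """Resolve 'a, b, and c' into a flat list of matching things."""
    found = []
    for phrase in re.split(r"\s*,\s*(?:and\s+)?|\s+and\s+", text.strip()):
        if phrase:
            found.extend(find_one(phrase, things))
    return found
```

Each feature is a small, separate step, which is what makes it
practical to add them one at a time.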

The next logical step would have been to have a way to register the
patterns with a central parser and have callbacks -- which is
essentially what MudOS's parser does in v22 and above.
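
The registration idea can be sketched generically in Python as below.
This is not MudOS's API; the Parser class and its register and
dispatch methods are invented for the example, and the callbacks
receive the raw slot text rather than resolved objects.

```python
import re

class Parser:
    """Hypothetical central parser: patterns are registered with callbacks."""

    def __init__(self):
        self._rules = []           # (compiled regex, callback), in order

    def register(self, pattern, callback):
        """Register a pattern; upper-case words are variable slots."""
        parts = [r"(.+?)" if token.isupper() else re.escape(token)
                 for token in pattern.split()]
        self._rules.append((re.compile(r"\s+".join(parts)), callback))

    def dispatch(self, line):
        """Call the first matching pattern's callback with the slot text.

        Returns the callback's result, or None if nothing matched.
        """
        for regex, callback in self._rules:
            match = regex.fullmatch(line.strip())
            if match:
                return callback(*match.groups())
        return None
```

Rules are tried in registration order, so a more specific form such as
"give OBJECTS to LIVING" should be registered before a looser one such
as "give LIVING OBJECTS".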

--
       |\      _,,,---,,_        Travis S. Casey  <efindel at io.com>
 ZZzz  /,`.-'`'    -.  ;-;;,_   No one agrees with me.  Not even me.
      |,4-  ) )-,_..;\ (  `'-'
     '---''(_/--'  `-'\_)




_______________________________________________
MUD-Dev maillist  -  MUD-Dev at kanga.nu
http://www.kanga.nu/lists/listinfo/mud-dev



