[DGD] Working with parse_string()

Kamil N kamiln at gmail.com
Wed Jul 22 02:57:16 CEST 2009


One more parsing question, connected to my recent post about "heavy
axe" and the parsing function running twice for it: is there a way to
"gracefully" recover from reaching the "bottom of the parsing tree"?
I'm having a big problem handling this kind of situation, which I will
try to describe now.

Consider this ambiguous sentence (still using the Polish "give" command):

give sharp long *swoerd tall sneaky man

*(that's not a mistake; I'm simulating a typo in the name to make it
impossible to find the object)

I assume that my grammar manipulates objects described by 1-3 words
(noun, adj1 noun, adj1 adj2 noun). This means that the above sentence
has _only one_ possible parse tree:

({ "sharp", "long", "swoerd" }) for OBJ and ({ "tall", "sneaky", "man"
}) for LIV

For the purposes of a more ambiguous grammar, I assume (at least I have
observed it doing so) that the parser starts from the biggest token
count it can: if it can grab 3 words describing an object it will do
so, even if the actual object is described by 2 words, adding
unnecessary words as long as it has enough tokens. So it's sometimes
necessary to return NIL to force the parser to narrow the tokens and
find the proper item.

But this doesn't work in the above example, because there is one and
only one token set matching the rules, and returning NIL in that case
(because there is no object "swoerd") results in global parser
failure. I tried using some catch-all rules in addition to "OBJ LIV",
but that's not really a clean solution, especially since I could do a
lot of things if I had some context inside the LPC function (for
example, how many tokens are reserved for the following rule, or that
this is the end of the chain and returning nil there will result in an
error).

A little example of how I use returning NIL in cases where it actually works:

GIVE SHARP SWORD TALL SNEAKY MAN

With the rule OBJ LIV, this will first call the finder function with these arguments:

({ "sharp", "sword", "tall" })

because "sneaky man" is still valid token input for the LIV rule, so it
reserves 3 tokens for the OBJ rule. Because no object matches those
three words ("tall" does not belong to the sword), I return NIL to
force the parser to narrow the tokens as much as it can:

({ "sharp", "sword" }) -> object found

The same rule would work for a one-word object ("give sword to tall sneaky man"):

({ "sword", "tall", "sneaky" })  => not found, return nil (narrowing)
({ "sword", "tall" })    => not found, return nil (narrowing)
({ "sword" })    => found!
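To make the narrowing above concrete, here is a small Python model of
it. This is not DGD code and not how parse_string is implemented; it
only assumes the behaviour I observed (longest OBJ candidate first, one
word dropped each time the finder returns nil). The names
candidate_splits and parse_give and the sample sets of objects are made
up for illustration.

```python
MAX_WORDS = 3  # assumption from this post: names are 1-3 words long


def candidate_splits(tokens):
    """Yield (obj_words, liv_words) splits, longest OBJ candidate first.

    A split is only valid if the words left over still fit the LIV rule
    (1 to MAX_WORDS words).
    """
    for n in range(min(MAX_WORDS, len(tokens) - 1), 0, -1):
        rest = tokens[n:]
        if 1 <= len(rest) <= MAX_WORDS:
            yield tokens[:n], rest


def parse_give(tokens, objects, livings):
    """Return (obj_words, liv_words, tried); obj/liv are None on failure."""
    tried = []  # every OBJ candidate handed to the "finder"
    for obj_words, liv_words in candidate_splits(tokens):
        tried.append(obj_words)
        if " ".join(obj_words) not in objects:
            continue  # models the finder returning nil: narrow and retry
        if " ".join(liv_words) in livings:
            return obj_words, liv_words, tried
    return None, None, tried  # models parse_string failing as a whole


if __name__ == "__main__":
    objects = {"sword", "sharp sword"}
    livings = {"tall sneaky man"}
    for line in ("sharp sword tall sneaky man",
                 "sword tall sneaky man",
                 "sharp long swoerd tall sneaky man"):
        print(line, "->", parse_give(line.split(), objects, livings))
```

In this model the six-word sentence with the typo has exactly one valid
split, so a single nil from the finder leaves nothing to narrow to and
the whole parse fails.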

I really miss the ability to determine, inside the LPC function,
whether I'm at the bottom of the parse tree, because this would allow
me to gracefully end parsing that tree if it's impossible to adjust the
results any further. Continuing with the above example, this could work
as follows:

The LPC function find_object receives ({ "sharp", "long", "swoerd" })
as its argument, together with the information that there are no more
possibilities to parse this rule further (so returning nil here will
result in parse_string giving an error/returning nil). Having this
information, I can decide to return an empty array ({ }) instead of NIL
and proceed to another rule, or finish parsing if it was the last rule.
I tried to base this decision on the size of the tree array: that made
it work in this case but broke another, so that's not really a
solution.
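A rough sketch of what I mean, again as a Python model rather than DGD
code. The is_last flag is hypothetical: as far as I can tell
parse_string passes nothing like it to the LPC function, and the model
computes it only because it knows all the splits in advance. An empty
list stands in for the empty array ({ }), and None stands in for nil.

```python
MAX_WORDS = 3  # assumption from this post: names are 1-3 words long


def splits(tokens):
    """All valid (obj_words, liv_words) splits, longest OBJ first."""
    out = []
    for n in range(min(MAX_WORDS, len(tokens) - 1), 0, -1):
        if 1 <= len(tokens) - n <= MAX_WORDS:
            out.append((tokens[:n], tokens[n:]))
    return out


def find(words, known, is_last):
    """Model of the finder with the wished-for extra argument.

    Returns [name] when found, None (nil) to ask for narrowing, or []
    when nothing was found but no further narrowing is possible.
    """
    name = " ".join(words)
    if name in known:
        return [name]
    return [] if is_last else None


def parse_give(tokens, objects, livings):
    """Return (obj, liv) lists, or None if there was no valid split."""
    alternatives = splits(tokens)
    for i, (obj_words, liv_words) in enumerate(alternatives):
        is_last = i == len(alternatives) - 1  # hypothetical flag
        obj = find(obj_words, objects, is_last)
        if obj is None:
            continue  # nil: try the next, narrower split
        liv = find(liv_words, livings, is_last)
        if liv is None:
            continue
        return obj, liv
    return None


if __name__ == "__main__":
    objects = {"sword", "sharp sword"}
    livings = {"tall sneaky man"}
    words = "sharp long swoerd tall sneaky man".split()
    print(parse_give(words, objects, livings))
```

With such a flag the command code would get back an empty OBJ result
next to a valid LIV and could answer "there is no such object" instead
of a generic parse failure.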

To finish this post, I'd like to say that I'm now quite successful with
parsing _existing_ objects: all my commands work as long as I type real
objects. As soon as I enter erroneous input (mistakes in both the OBJ
to give and the LIV to receive, using shorter and longer names in the
range of 1 to 3 words), it starts to get messy, because the LPC
functions that do all the lookups start to return nil in places where
it causes parse_string to just fail.

I have done a lot of magic with parse_string so far and I think I'm
starting to understand it a bit more, but I just can't find a way to
gracefully recover from rules where bad input is given and no objects
can be found. :( I wouldn't be surprised if it's possible with some
tricky rules arranged properly; maybe you have some nice ideas on how
to achieve this.

Regards,
KN


