[DGD] Re: parse string difficulties

Erwin Harte harte at is-here.com
Sun Mar 28 20:54:31 CEST 2004


On Sun, Mar 28, 2004 at 05:20:56PM +0000, Robert Forshaw wrote:
> Despite my best efforts I can't understand how parse string is supposed to 
> work. I thought I *might* know what I was doing, when I put it to the test, 
> but got some unexpected results.
[...]
> I seriously don't know what I'm doing here. It would be most helpful if 
> someone could show me line for line how to write a grammar that interprets 
> my datafiles, that would help me relate to what the function is doing. 
> Anyway, all help is appreciated...

I like a challenge like that and did some experimenting.  This is the
grammar I came up with:

    string query_grammar()
    {
        return
            /* tokens: input matching "whitespace" is skipped by parse_string() */
            "whitespace = /[\b\r\t ]+/\n" +
            "newline    = /\n/\n" +
            "word       = /[a-zA-Z0-9]+/\n" +
            "operator   = /[\\.\\+\\=\\-]+/\n" +

            /* a SENTENCE is one or more OPERATIONs */
            "SENTENCE   : OPERATION          ? fun_a\n" +
            "SENTENCE   : SENTENCE OPERATION ? fun_b\n" +

            /* each OPERATION is one line of the datafile */
            "OPERATION  : word operator word newline ? fun_1\n" +
            "OPERATION  : word operator      newline ? fun_2\n" +
            "OPERATION  :      operator word newline ? fun_3\n";
    }

You need to double-escape the ., +, = and - in the operator regexp.
The LPC compiler consumes one level of escaping, so it takes "\\." in
the source for the parse_string() kfun to actually _see_ \. while "\."
reaches it as a plain ".".  Your original word regexp also didn't
include digits, so I added 0-9.
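
To make the escaping concrete, here is a small sketch of what is left
of the string once the compiler is done with it.  The function name
count_chars is made up for illustration:

    static int count_chars()
    {
        string s;

        s = "\\.";          /* two backslashes in the source */
        return strlen(s);   /* 2: one backslash and one dot remain */
    }

Those two remaining characters are what the regexp in the grammar gets
to work with.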

I took the newline out of the whitespace regexp so that it becomes a
token of its own.  That avoids grammar confusion between

  word operator word
  operator word

and

  word operator
  word operator word

which would otherwise be impossible to distinguish reliably.
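
As a quick check of that, and assuming the query_grammar() above, both
layouts can be fed straight to parse_string().  The function name
newline_demo is made up for illustration, and the results in the
comments are what the rules should produce, not pasted output:

    static mixed *newline_demo()
    {
        return ({
            /* word operator word / operator word, expected:
               ({ ({ ({ "=", "a", "b" }), ({ "=", nil, "c" }) }) }) */
            parse_string(query_grammar(), "a=b\n=c\n"),

            /* word operator / word operator word, expected:
               ({ ({ ({ "=", "a", nil }), ({ "=", "b", "c" }) }) }) */
            parse_string(query_grammar(), "a=\nb=c\n")
        });
    }

With the newline folded into whitespace, both inputs would reach the
grammar as the same token sequence: word operator word operator word.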

The fun_a function starts a list with the first word/operator/word
combination, and fun_b appends each following one to that list.

    static mixed *fun_a(mixed *tree)
    {   
	return ({ tree });
    }

    static mixed *fun_b(mixed *tree)
    {   
	return ({ tree[0] + ({ tree[1] }) });
    }

The fun_1, fun_2 and fun_3 functions fill in the blanks (nils) where
appropriate and create 3-tuples (arrays of size 3) in the order you
wanted: operator, first word, second word.

    static mixed *fun_1(mixed *tree)
    {   
	return ({ ({ tree[1], tree[0], tree[2] }) });
    }

    static mixed *fun_2(mixed *tree)
    {   
	return ({ ({ tree[1], tree[0], nil }) });
    }

    static mixed *fun_3(mixed *tree)
    {   
	return ({ ({ tree[0], nil, tree[1] }) });
    }

When I throw something like ".food\nweight=8\n.chocolate\n" at it, it
returns:

  ({ ({ ({ ".", nil, "food" }),
        ({ "=", "weight", "8" }),
        ({ ".", nil, "chocolate" }) }) })

In general, a wrapper like this strips the outer array and passes nil
through when the text fails to parse:

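    static mixed *parse_text(string text)
    {
        mixed *result;

        /* nil means the text did not match the grammar */
        result = parse_string(query_grammar(), text);
        /* strip the outer array added by fun_a/fun_b */
        return result ? result[0] : nil;
    }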
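
To use the result you can then walk the list of 3-tuples.  A rough
sketch, where describe is a made-up name and "-" is an arbitrary
stand-in for a nil slot:

    static string describe(string text)
    {
        mixed *entries, *entry;
        string out;
        int i, sz;

        entries = parse_text(text);
        if (!entries) {
            return "no match\n";    /* text did not fit the grammar */
        }
        out = "";
        for (i = 0, sz = sizeof(entries); i < sz; i++) {
            entry = entries[i];     /* ({ operator, word or nil, word or nil }) */
            out += entry[0] + " "
                 + (entry[1] ? entry[1] : "-") + " "
                 + (entry[2] ? entry[2] : "-") + "\n";
        }
        return out;
    }

For the sample text above this should give ". - food\n= weight 8\n"
followed by ". - chocolate\n".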

Hope that helps,

Erwin.
-- 
Erwin Harte <harte at is-here.com>