[DGD] Newbie question regarding parse_string()

Mon Oct 22 19:34:55 CEST 2001

"Troels Therkelsen" <troels at 2-10.org> wrote:

>[...]
>   word = /[a-zA-Z]+/
>   path = /[^ :]+/
>
>   Grammar: '[' word ']'
>
> With input string "[foo]" it fails (returns nil).  If I remove the path
> token from the grammar, it works and returns ({ "[", "foo", "]" }) as
> expected.
>
> It's probably my lacking an understanding of how the internals work, but
> I thought the parser would only try to match against a token if your
> production rules mention it?

The parser will try to match a token if there is a token rule for it.
There are cases where a token match is actually supposed to block any
possible production rule match, so the parser does not require that
all tokens are used in production rules.

> I'm guessing it's because path can match '[', 'foo' and ']' (which is
> longer than the nothing the word token will match, and the docs say
> the longest regex will be matched) so the parser thinks the string
> starts with a path token, which means when the production rule asks
> for '[' it will already have been taken by the path token.

This is correct.
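
The longest-match behaviour can be modelled in a few lines of Python.
This is only an illustrative sketch, not DGD code: it uses Python's re
module instead of parse_string(), the helper name longest_match is made
up for the example, and letting the first-listed rule win a tie is an
assumption of the sketch.

```python
import re

# Token rules from the question, in declaration order. The literal
# brackets used in the production rule are modelled as tokens too.
TOKENS = [
    ("word", r"[a-zA-Z]+"),
    ("path", r"[^ :]+"),
    ("'['", r"\["),
    ("']'", r"\]"),
]

def longest_match(text, pos=0, tokens=TOKENS):
    """Return (name, lexeme) of the longest token matching at pos, or None."""
    best = None
    for name, pattern in tokens:
        m = re.compile(pattern).match(text, pos)
        # Only a strictly longer match replaces the current best, so an
        # earlier rule wins a tie (an assumption of this sketch).
        if m and m.end() > pos and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best

print(longest_match("[foo]"))  # ('path', '[foo]')
```

The whole input is consumed as a single path token, so nothing is left
for the '[' that the production rule asks for.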

If you do not want the path token to match "[foo]", you should refine
its definition to exclude this.  If you want context-sensitive token
matching, you should parse in multiple passes with different grammars,
where "[foo]" may in a secondary stage be parsed as "[", "foo", "]".
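
As a rough illustration of the first option, the Python sketch below
models longest-match lexing with a path rule that also excludes '[' and
']'. It is not DGD code: the tokenize helper is hypothetical, the refined
regular expression is just one possible choice, and letting the
first-listed rule win a tie is an assumption of the sketch.

```python
import re

# Same rules as in the question, except that path now also excludes
# '[' and ']' (one possible refinement; adjust to the real path syntax).
TOKENS = [
    ("word", r"[a-zA-Z]+"),
    ("path", r"[^ :\[\]]+"),
    ("'['", r"\["),
    ("']'", r"\]"),
]

def tokenize(text, tokens=TOKENS):
    """Split text into (name, lexeme) pairs using longest-match lexing."""
    out, pos = [], 0
    while pos < len(text):
        best = None
        for name, pattern in tokens:
            m = re.compile(pattern).match(text, pos)
            # Only a strictly longer match replaces the current best, so
            # an earlier rule wins a tie (an assumption of this sketch).
            if m and m.end() > pos and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no token matches at position {pos}")
        out.append((best[0], best[1].group()))
        pos = best[1].end()
    return out

print(tokenize("[foo]"))  # [("'['", '['), ('word', 'foo'), ("']'", ']')]
```

With the brackets excluded, path can no longer swallow the whole input,
and the three lexemes "[", "foo", "]" line up with the production rule.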

Regards,
Dworkin
_________________________________________________________________
List config page:  http://list.imaginary.com/mailman/listinfo/dgd