[DGD] Working with parse_string()

Shentino shentino at gmail.com
Tue Jul 21 13:42:02 CEST 2009


On Tue, Jul 21, 2009 at 4:19 AM, Felix A. Croes <felix at dworkin.nl> wrote:

> Shentino <shentino at gmail.com> wrote:
>
> > This reminds me of another couple of questions I've wanted to ask:
> >
> > 1.  Apart from padding a grammar with ambiguity catching error cases, is
> > there any way to extract diagnostic information from a bad parse?
> >
> > So far, I can think of tracking the successful fragment parses as
> > "progress points"
>
> Adding rules to handle common errors is what you should do.  Though it
> may work for some grammars, in general you can't depend on diagnostic
> information gathered in LPC functions called by parse_string(), since
> several functions may be called for the same tokens as the parser tries
> to fit them to different rules, and there is no guarantee that the last
> function called is the most informative one.


Are there any guarantees about which order they will be called in?
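
To make the "progress points" idea concrete, here is a rough sketch. It is written in Python rather than LPC and does not use parse_string() at all; the names match_rule and parse, and the toy rules in the usage note, are made up for illustration. It only shows the bookkeeping: remember the furthest token any rule reached, instead of trusting whichever rule happened to be tried last.

```python
def match_rule(rule, tokens, progress):
    """Try to match one rule (a list of expected tokens) against the input.

    Records the furthest token index reached in progress[0], even when
    the rule fails. Returns True only if the rule consumes every token.
    """
    i = 0
    for expected in rule:
        if i >= len(tokens) or tokens[i] != expected:
            progress[0] = max(progress[0], i)
            return False
        i += 1
    progress[0] = max(progress[0], i)
    return i == len(tokens)


def parse(rules, tokens):
    """Try each rule in turn.

    Returns (ok, furthest), where furthest is the number of tokens the
    most successful rule got through before giving up.
    """
    progress = [0]
    for rule in rules:
        if match_rule(rule, tokens, progress):
            return True, len(tokens)
    return False, progress[0]
```

With the rules [["get", "the", "sword"], ["get", "all"]] and the input ["get", "the", "axe"], the second rule is tried last but gives up after one token, while the first got through two. Reporting the maximum rather than the last attempt is the whole point, and it is also why the calling order matters to me.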


> > 2.  How do you handle a grammar parse where the grammar, the sentence, or
> > both exceed the limits of string size, without hacking DGD and upping the
> > limits.
>
> This doesn't make sense.  How do I multiply two numbers, 3 and 4, with
> neither number larger than 2?


Um, ya lost me.

What I meant was: what do you do if the grammar and/or the sentence to be
parsed exceeds 64K?

Sorta like how you went from string to string * for compiling, perhaps
someday in the future the same can be done for parse_string.  So um...yes,
it was a suggestion and not a question. :P

> > A case in point for this would be parsing a very large XML document that
> > was larger than 64K
>
> As long as the XML document fits in a string, I don't see why it couldn't
> be parsed.


But in my case it wouldn't fit.

Hence my previous statements.

Increasing string size limits is a choice, but I'm somewhat leery about
that, since it could easily become a DoS vector.

Preprocessing stuff into smaller chunks just seems a bit hackish.  It might
be a solution, but non-trivial.
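
For what it's worth, the chunking I have in mind would look roughly like the sketch below. It is Python, not LPC, and the name split_chunks, the 65535 limit and the choice of ">" as the cut point are all my assumptions. It also assumes ">" never appears inside attribute values, comments or CDATA, which real XML does not guarantee.

```python
LIMIT = 65535  # assumed string size limit, standing in for the 64K above


def split_chunks(text, limit=LIMIT, delim=">"):
    """Split text into pieces of at most `limit` characters.

    Cuts are made only just after `delim`, so no tag is broken across
    two pieces. Raises ValueError if no safe cut point fits in a piece.
    """
    chunks = []
    start = 0
    while len(text) - start > limit:
        # Last delimiter that still fits inside this piece.
        cut = text.rfind(delim, start, start + limit)
        if cut < 0:
            raise ValueError("no delimiter within limit; cannot split safely")
        chunks.append(text[start:cut + 1])
        start = cut + 1
    if start < len(text):
        chunks.append(text[start:])
    return chunks
```

Even then, each piece is only safe to tokenize on its own. The nesting state (which elements are still open) has to be carried from one piece to the next by hand, which is the non-trivial part.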


>  Regards,
> Felix Croes


