[DGD] Parse_string question
S. Foley
s_d_foley at hotmail.com
Tue Feb 12 12:14:55 CET 2002
Your grammar was:
Noah Lee Gibbs <angelbob at monkeyspeak.com> wrote:
>regularstring = /[^~{}\\]*/
>
>unq_document: unq_string
>
>substring: regularstring
>substring: regularstring '\\' '{' ? concat_string_char
>substring: regularstring '\\' '}' ? concat_string_char
>substring: regularstring '\\' '~' ? concat_string_char
>substring: regularstring '\\' '\\' ? concat_string_char
>substring: substring regularstring ? concat_strings
>
>unq_tag: '{' substring '}' ? anon_tag
>unq_tag: '~' regularstring '{' substring '}' ? named_tag
>unq_tag: '{' unq_string '}' ? anon_tag
>unq_tag: '~' regularstring '{' unq_string '}' ? named_tag
>
>unq_string: substring ? simple_anon_tag
>unq_string: unq_tag
>unq_string: unq_string substring ? concat_ustring_string
>unq_string: unq_string unq_tag
I'll help as much as I can. First off, here is the source of the ambiguity
for the input string you gave. You have the following 3 production
rules:
a) unq_string: substring ? simple_anon_tag
b) unq_tag: '~' regularstring '{' substring '}' ? named_tag
c) unq_tag: '~' regularstring '{' unq_string '}' ? named_tag
The ambiguity exists because a named tag whose body is a plain substring
can be derived in two ways: directly with b, or with c after a has turned
the substring into a unq_string. The solution might be to get rid of
production rule b. It's hard to say without knowing what the LPC functions
are doing.
Another ambiguity:
a) unq_tag: '{' substring '}' ? anon_tag
b) unq_tag: '{' unq_string '}' ? anon_tag
c) unq_string: substring ? simple_anon_tag
Anything that matches a can also be derived by applying b together with c.
The solution here might be to eliminate production rule a, though whether
you can do that and still build the return array the way you want depends
on what the LPC functions do.
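To make the two ambiguities above concrete, here is a small Python sketch
that counts the parse trees a grammar allows for a token sequence. It is
not DGD code and does not call parse_string. The names count_parses,
ORIGINAL and REDUCED are made up for this illustration, the token "STR"
stands in for regularstring, and the backslash-escape rules for substring
are left out to keep it short.

```python
from functools import lru_cache


def count_parses(grammar, start, tokens):
    """Count the distinct parse trees of `tokens` derived from `start`.

    `grammar` maps a nonterminal to a list of right-hand sides (tuples of
    symbols). Any symbol that is not a key of `grammar` is a terminal.
    Assumes no rule derives the empty string and no cyclic unit rules.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` derives tokens[i:j].
        if symbol not in grammar:
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(sequence(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def sequence(rhs, i, j):
        # Number of ways the symbol sequence `rhs` derives tokens[i:j].
        if len(rhs) == 1:
            return derive(rhs[0], i, j)
        head, rest = rhs[0], rhs[1:]
        total = 0
        # Every symbol must consume at least one token.
        for m in range(i + 1, j - len(rest) + 1):
            left = derive(head, i, m)
            if left:
                total += left * sequence(rest, m, j)
        return total

    return derive(start, 0, len(tokens))


# The quoted grammar, minus the escape rules (a simplification).
ORIGINAL = {
    "unq_string": [("substring",), ("unq_tag",),
                   ("unq_string", "substring"), ("unq_string", "unq_tag")],
    "unq_tag": [("{", "substring", "}"),
                ("~", "STR", "{", "substring", "}"),
                ("{", "unq_string", "}"),
                ("~", "STR", "{", "unq_string", "}")],
    "substring": [("STR",), ("substring", "STR")],
}

# Same grammar without the two tag rules that take `substring` directly.
REDUCED = dict(ORIGINAL, unq_tag=[("{", "unq_string", "}"),
                                  ("~", "STR", "{", "unq_string", "}")])

named = ["~", "STR", "{", "STR", "}"]  # e.g. ~name{text}
anon = ["{", "STR", "}"]               # e.g. {text}

print(count_parses(ORIGINAL, "unq_string", named))  # 2
print(count_parses(REDUCED, "unq_string", named))   # 1
print(count_parses(ORIGINAL, "unq_string", anon))   # 2
print(count_parses(REDUCED, "unq_string", anon))    # 1
```

A count above 1 means the input is ambiguous under that grammar. Dropping
the two rules that put substring directly inside a tag brings both test
inputs down to a single parse.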
I think you can get around your token problem by just writing a better
regular expression. The one I came up with is:
regularstring = /((\\\\{)|(\\\\})|(\\\\~)|(\\\\\\\\)|([^~{}\\\\]))*/
I'm not sure this suits your purposes, as I'm not 100% sure what you
intended the concat_string_char function to do, though I have a good guess.
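As a rough check of what that expression accepts, here is the same pattern
in Python's re syntax. This is only a stand-in: it assumes the four
backslashes above come from writing the pattern inside an LPC string, so
that the pattern itself sees an escaped backslash followed by the special
character. The names REGULARSTRING and longest_token are made up for the
example.

```python
import re

# One level of escaping instead of two: a raw Python string rather than
# an LPC string literal. Alternatives: \{  \}  \~  \\  or any character
# other than ~ { } \
REGULARSTRING = re.compile(r'(?:\\\{|\\\}|\\~|\\\\|[^~{}\\])*')


def longest_token(text):
    """Return the longest regularstring match at the start of `text`."""
    return REGULARSTRING.match(text).group(0)


# Escaped braces stay inside the token; the bare '{' ends it.
print(longest_token(r'a\{b\}c{rest'))
# A bare '~' ends the token as well.
print(longest_token('plain~tag'))
```

The point is that an escaped character is consumed as part of the token
itself, so nothing has to be glued back together afterwards.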
I wrote up a quick grammar around that regular expression. It has no LPC
functions attached. I suspect tacking functions onto the appropriate
production rules might get you where you want to go, but since I'm not
certain how you want the return array crafted, maybe not. Anyway, it can't
hurt to offer one up to you.
regularstring = /((\\\\{)|(\\\\})|(\\\\~)|(\\\\\\\\)|([^~{}\\\\]))*/
unq_document: unq_string
unq_tag: '{' unq_string '}'
unq_tag: '~' regularstring '{' unq_string '}'
unq_string: regularstring
unq_string: unq_tag
unq_string: unq_string regularstring
unq_string: unq_string unq_tag
The rewritten regexp gets rid of the need to concatenate strings via lpc
functions. Eliminating the production rules mentioned above seemed to get
rid of the ambiguity, at least with respect to the test string you gave.
I'm not sure this helps at all, but I figured it was worth a shot.
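For what it's worth, the shape of the reduced grammar can be sketched as a
small recursive-descent parser in Python. This is a stand-in, not
parse_string: the function names and the output shape (strings, plus
(name, children) tuples with name None for an anonymous tag) are
assumptions made for the example, not the return array you are after. The
token pattern uses + instead of * so it never matches an empty string, and
the sketch accepts an empty tag body, which the grammar as written may not.

```python
import re

# regularstring, with one level of escaping and '+' so it is never empty.
REGULARSTRING = re.compile(r'(?:\\\{|\\\}|\\~|\\\\|[^~{}\\])+')


def parse_unq(text):
    """Parse `text` into a list of strings and (name, children) tuples."""
    items, pos = _unq_string(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at offset {pos}")
    return items


def _unq_string(text, pos):
    # unq_string: a run of regularstring and unq_tag items.
    items = []
    while pos < len(text) and text[pos] != "}":
        if text[pos] in "{~":
            item, pos = _unq_tag(text, pos)
        else:
            m = REGULARSTRING.match(text, pos)
            if not m:
                raise ValueError(f"bad escape at offset {pos}")
            item, pos = m.group(0), m.end()
        items.append(item)
    return items, pos


def _unq_tag(text, pos):
    # unq_tag: '{' unq_string '}'  or  '~' regularstring '{' unq_string '}'
    name = None
    if text[pos] == "~":
        m = REGULARSTRING.match(text, pos + 1)
        if not m:
            raise ValueError(f"missing tag name at offset {pos + 1}")
        name, pos = m.group(0), m.end()
    if pos >= len(text) or text[pos] != "{":
        raise ValueError(f"expected '{{' at offset {pos}")
    children, pos = _unq_string(text, pos + 1)
    if pos >= len(text) or text[pos] != "}":
        raise ValueError(f"expected '}}' at offset {pos}")
    return (name, children), pos + 1


print(parse_unq("a~b{c{d}}e"))
# ['a', ('b', ['c', (None, ['d'])]), 'e']
```

Because each tag body is always parsed as a unq_string, there is only one
way to read any given input, which is the property the reduced grammar is
meant to have.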
Good luck. Maybe someone else on the list will help you further along than
I can. Perhaps if you give greater detail concerning what these LPC
functions are doing, a sample input string, and the exact return value you'd
like to see for that input string, I (or, better yet, someone a little
smarter) could help you better.
--Steve