[DGD] Parse_string question

S. Foley s_d_foley at hotmail.com
Tue Feb 12 12:14:55 CET 2002


Your grammar was:

Noah Lee Gibbs <angelbob at monkeyspeak.com> wrote:
>regularstring = /[^~{}\\]*/
>
>unq_document: unq_string
>
>substring: regularstring
>substring: regularstring '\\' '{' ? concat_string_char
>substring: regularstring '\\' '}' ? concat_string_char
>substring: regularstring '\\' '~' ? concat_string_char
>substring: regularstring '\\' '\\' ? concat_string_char
>substring: substring regularstring ? concat_strings
>
>unq_tag: '{' substring '}' ? anon_tag
>unq_tag: '~' regularstring '{' substring '}' ? named_tag
>unq_tag: '{' unq_string '}' ? anon_tag
>unq_tag: '~' regularstring '{' unq_string '}' ? named_tag
>
>unq_string: substring ? simple_anon_tag
>unq_string: unq_tag
>unq_string: unq_string substring ? concat_ustring_string
>unq_string: unq_string unq_tag

I'll help as much as I can.  First off, here is the source of your 
ambiguity with respect to the input string given.  You have the following 
3 production rules:

a) unq_string: substring ? simple_anon_tag
b) unq_tag: '~' regularstring '{' substring '}' ? named_tag
c) unq_tag: '~' regularstring '{' unq_string '}' ? named_tag

The ambiguity exists because anything rule b matches can be derived in two 
ways: by using b directly, or by using c in conjunction with a (the 
substring is first reduced to a unq_string).  The solution might be to get 
rid of production rule b.  It's hard to say without knowing what the LPC 
functions are doing.

Another ambiguity:

a) unq_tag: '{' substring '}' ? anon_tag
b) unq_tag: '{' unq_string '}' ? anon_tag
c) unq_string: substring ? simple_anon_tag

Anything rule a matches can also be derived by applying b and c.  The 
solution here might be to eliminate production rule a, though you might not 
be able to do that and still craft the return array in the manner you want.  
Again, it's hard to say without knowing how you want that array crafted.
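
To make the two ambiguities above concrete, here is a minimal Python sketch 
(not DGD's parse_string) that counts how many derivations a token sequence 
has.  It works on a token-level copy of the posted grammar under several 
assumptions: "STR" is a stand-in name for a regularstring token, the input is 
already tokenized, and the backslash-escape rules and the LPC functions are 
left out.  The names count_parses, ORIGINAL and REDUCED are made up for this 
illustration.

```python
from functools import lru_cache

# Token-level copy of the posted grammar.  "STR" stands for a regularstring
# token; the backslash-escape rules and the "? function" parts are left out.
ORIGINAL = {
    "unq_document": [("unq_string",)],
    "substring": [("STR",), ("substring", "STR")],
    "unq_tag": [
        ("{", "substring", "}"),
        ("~", "STR", "{", "substring", "}"),
        ("{", "unq_string", "}"),
        ("~", "STR", "{", "unq_string", "}"),
    ],
    "unq_string": [
        ("substring",),
        ("unq_tag",),
        ("unq_string", "substring"),
        ("unq_string", "unq_tag"),
    ],
}

# Same grammar with the two "substring inside braces" tag rules removed.
REDUCED = dict(ORIGINAL)
REDUCED["unq_tag"] = [
    ("{", "unq_string", "}"),
    ("~", "STR", "{", "unq_string", "}"),
]


def count_parses(grammar, start, tokens):
    """Count the distinct derivations of `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(sym, i, j):
        # Derivations of tokens[i:j] from a single symbol.
        if sym not in grammar:  # terminal: must be exactly one matching token
            return int(j == i + 1 and tokens[i] == sym)
        return sum(ways(rhs, i, j) for rhs in grammar[sym])

    @lru_cache(maxsize=None)
    def ways(rhs, i, j):
        # Derivations of tokens[i:j] from a sequence of symbols.
        if len(rhs) == 1:
            return count(rhs[0], i, j)
        total = 0
        # The first symbol takes tokens[i:k]; every remaining symbol needs at
        # least one token.  The grammar has no empty rules and no cycle of
        # single-symbol rules, so the recursion always shrinks and terminates.
        for k in range(i + 1, j - len(rhs) + 2):
            left = count(rhs[0], i, k)
            if left:
                total += left * ways(rhs[1:], k, j)
        return total

    return count(start, 0, len(tokens))


ANON = ("{", "STR", "}")               # e.g. {abc}
NAMED = ("~", "STR", "{", "STR", "}")  # e.g. ~name{abc}

print(count_parses(ORIGINAL, "unq_document", ANON))   # 2
print(count_parses(ORIGINAL, "unq_document", NAMED))  # 2
print(count_parses(REDUCED, "unq_document", ANON))    # 1
print(count_parses(REDUCED, "unq_document", NAMED))   # 1
```

With both pairs of tag rules present, a plain string inside braces has two 
derivations; dropping the rules that take a substring directly leaves one.  
This only checks these two short inputs, not the grammar as a whole.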

I think you can get around your token problem by just writing a better 
regular expression.  The one I came up with is:

regularstring = /((\\\\{)|(\\\\})|(\\\\~)|(\\\\\\\\)|([^~{}\\\\]))*/
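
As a rough way to experiment with this token outside DGD, the pattern can be 
approximated with Python's re module.  This is a sketch under one 
assumption: each run of four backslashes in the pattern above stands for one 
literal backslash in the input once the LPC string and regexp escaping 
layers are peeled off.  The helper name leading_regularstring is 
hypothetical.

```python
import re

# Escape pairs (\{  \}  \~  \\) or any single character other than ~ { } \,
# repeated zero or more times.
REGULARSTRING = re.compile(r'(?:\\\{|\\\}|\\~|\\\\|[^~{}\\])*')


def leading_regularstring(text):
    """Return the longest prefix of `text` the token pattern accepts."""
    return REGULARSTRING.match(text).group(0)


# Escaped braces stay inside the token ...
print(leading_regularstring(r'abc\{def\}ghi'))
# ... while an unescaped brace or tilde ends it.
print(leading_regularstring('abc{def'))
print(leading_regularstring(r'a\\b~c'))
```

Because the escape pairs are absorbed by the token itself, the grammar no 
longer needs separate rules for a backslash followed by a special character.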

I'm not sure this suits your purposes, as I'm not 100% sure what you 
intended the concat_string_char function to do, though I have a good guess.  
I wrote up a quick grammar around it.  I suspect tacking functions onto the 
appropriate production rules might get you where you want to go, but maybe 
not; again, I'm not certain how you wanted to craft the return array.  
Anyway, it can't hurt to offer one up to you.

regularstring = /((\\\\{)|(\\\\})|(\\\\~)|(\\\\\\\\)|([^~{}\\\\]))*/

unq_document: unq_string
unq_tag: '{' unq_string '}'
unq_tag: '~' regularstring '{' unq_string '}'
unq_string: regularstring
unq_string: unq_tag
unq_string: unq_string regularstring
unq_string: unq_string unq_tag

The rewritten regexp gets rid of the need to concatenate strings via LPC 
functions.  Eliminating the production rules mentioned above seemed to get 
rid of the ambiguity, at least with respect to the test string you gave.
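
For anyone who wants to try the shape of this simplified grammar outside 
DGD, here is a small hand-written recursive-descent sketch in Python.  It is 
not parse_string and makes several assumptions: the left-recursive 
unq_string rules are treated as "one or more strings or tags", text runs 
must be non-empty, and the result layout (plain str for text, a 
(name, children) tuple for a tag, with name None for an anonymous tag) is 
invented here and is not what the original LPC functions built.

```python
import re

# Non-empty run of escape pairs (\{ \} \~ \\) or ordinary characters.
TEXT = re.compile(r'(?:\\[{}~\\]|[^~{}\\])+')


def parse_unq(text):
    """Parse `text` and return a nested list (hypothetical result shape)."""
    items, pos = _parse_string(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at offset {pos}")
    return items


def _parse_string(text, pos):
    # unq_string: one or more text runs or tags, up to '}' or end of input.
    items = []
    while pos < len(text) and text[pos] != '}':
        if text[pos] == '{':                      # anonymous tag
            children, pos = _parse_body(text, pos)
            items.append((None, children))
        elif text[pos] == '~':                    # named tag
            name = TEXT.match(text, pos + 1)
            if not name or text[name.end():name.end() + 1] != '{':
                raise ValueError(f"bad tag name at offset {pos}")
            children, pos = _parse_body(text, name.end())
            items.append((name.group(0), children))
        else:                                     # plain text run
            run = TEXT.match(text, pos)
            if not run:
                raise ValueError(f"stray backslash at offset {pos}")
            items.append(run.group(0))
            pos = run.end()
    if not items:
        raise ValueError(f"empty unq_string at offset {pos}")
    return items, pos


def _parse_body(text, pos):
    # text[pos] is '{'; parse the enclosed unq_string and the closing '}'.
    children, pos = _parse_string(text, pos + 1)
    if text[pos:pos + 1] != '}':
        raise ValueError("missing '}'")
    return children, pos + 1


print(parse_unq(r'a~b{c\}d}{e}'))
```

The escaped brace stays inside the text run, so no string concatenation step 
is needed; functions attached to the production rules in the real grammar 
would take the place of the list building done here.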

I'm not sure this helps at all, but I figured it was worth a shot.

Good luck.  Maybe someone else on the list will help you further along than 
I can.  Perhaps if you give greater detail concerning what these LPC 
functions are doing, a sample input string, and the exact return value you'd 
like to see for that input string, I (or, better yet, someone a little 
smarter) could help you better.

--Steve
