[DGD] parse_string() implementation of regex (was: Memory usage)

Tavis Elliott tavise at nwlink.com
Fri Mar 14 20:35:33 CET 2003


On Thu, 13 Mar 2003 23:50:23 +0100 (CET), Felix A. Croes <felix at dworkin.nl> 
wrote:

> [...]
> Or you can try to bite an even bigger bullet, and re-implement the
> regular expression functionality using parse_string().  That is not
> just possible, it should also be more efficient for long/frequent
> searches.
>
> This could be done through a central object which, using parse_string(),
> compiles regular expressions into a valid string parser grammar, which
> is then used in the object that called the regular expression function.
> A simple way to start would be to use parse_string()'s own regular
> expressions for tokens.

This seemed like a pretty good idea, so I took a whack at it.  I read all
the parse_string() docs I could find that came with the driver, as well as
every message about parse_string() in the available mailing list archives.
(Note: I have no prior experience with natural language parsing.)

Here's my first attempt, which doesn't seem to do what I want:

  grammar = "pattern = /"+ pattern +"/ " +
    "other = /./ "+
    "match : pattern";

Using a pattern of 'foo' on a string of 'foo' matches fine, but using a 
pattern of 'foo' on any other string (such as 'afoo') returns nil.

My expectation was that the token 'pattern' would match wherever the
pattern occurs, and the token 'other' would match every other character.
Tokenizing would then never fail, since 'other' would consume the whole
string if 'pattern' matched nothing, and parse_string() would return nil
only when no 'pattern' token was found.

I feel I'm missing something major.
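
One possible explanation, offered as an untested sketch rather than a
confirmed answer: going by the parser documentation, parse_string() returns
nil unless the start rule accounts for every token in the string, and
'other' is defined as a token but never used in a production rule.  The
variant below lets 'match' consume a whole sequence of tokens and throws the
'other' ones away.  The function names find_matches() and drop(), and the
rules 'items' and 'item', are made up for this illustration.

  /* called by parse_string() for each 'other' token: discard it */
  mixed *drop(mixed *parsed)
  {
      return ({ });
  }

  /*
   * hypothetical helper: parse str as a sequence of 'pattern' and
   * 'other' tokens, keeping only the 'pattern' ones
   */
  mixed *find_matches(string pattern, string str)
  {
      string grammar;

      grammar = "pattern = /" + pattern + "/ " +
                "other = /./ " +
                "match : items " +
                "items : item " +
                "items : items item " +
                "item : pattern " +
                "item : other ? drop";
      return parse_string(grammar, str);
  }

If that reading is right, a pattern of 'foo' on a string of 'afoo' should no
longer return nil, because the leading 'a' is now covered by a rule.  Two
caveats: the pattern is spliced into the grammar unescaped, so a '/' in it
would break the token rule, and tokens are matched without overlapping, which
is not quite the same as a regex search.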

I noticed a note posted by Dworkin a couple years ago that this could be 
used to implement regex, and that anyone who did so should post their code 
on this list.  No code has been posted, but it seems hard to believe that 
no one has implemented this yet ... Everyone who is smarter than me, and 
who has done so, must be shy.  :)

Thanks in advance to anyone who feels like helping!

-T.

---------------------------------------------------
Busier than a one-legged Riverdancer.