[DGD] Working with parse_string()
bart at wotf.org
Fri Jul 24 15:53:45 CEST 2009
On Thu, 23 Jul 2009 18:30:20 -0700, Shentino wrote
>
> What are the rules of grammar for the language in question that you
> are parsing?
>
> My hypothesis is that it has to be simple enough for humans.
>
While I don't know Polish, a little investigation turns up the following:
Polish has seven cases of declension, each with a singular and a plural form,
so a noun can have up to fourteen forms. Some of those forms look identical.
On top of that, consonants at the end of a word stem often change depending on
the declension, even for regular nouns.
So, it is somewhat possible to derive an inflected form from the stem based on
rules, but going the other way, from an inflected form back to the stem, is
very difficult without having some kind of lexicon.
Humans who know the language tend to have sufficient vocabulary to recognize
the word stem and derive what case is being used. Also, humans are somewhat
good with patterns in a way a computer generally isn't :)
One thing I have been thinking is that you'll need to differentiate between
invalid parsing and not being able to resolve the object.
'adjective adjective' would be a case of invalid parsing
'adjective noun' would be a valid parsing, regardless of being able to resolve
this to a specific object.
An object rule should contain a noun, preceded by optional adjectives.
This would require being able to determine if a word is a noun (without trying
to resolve it to a specific object).
This would look like:

    OBJ: opt_adjectives noun
    noun: word ? test_noun
    opt_adjectives:
    opt_adjectives: opt_adjectives adjective
    adjective: word ? test_adjective

    mixed *test_noun(mixed *arg) {
        /* looks up if arg[0] is a noun */
    }

etc.
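
To make the grammar above a bit more concrete, here is a minimal sketch of how
it might be wrapped in a parse_string() call. The word lists, init_words() and
parse_object_phrase() are made up for illustration, the token rules
(whitespace, word) are my assumption about what the full grammar would need,
and I am assuming that a rule function returning nil rejects that parse, which
is how I read the parse_string documentation.

```lpc
/* Placeholder word lists (assumption); a real game would use a lexicon. */
private mapping nouns;
private mapping adjectives;

static void init_words()
{
    nouns = ([ "sword" : 1, "shield" : 1 ]);
    adjectives = ([ "rusty" : 1, "small" : 1 ]);
}

/* Called for the "noun" rule: keep the parse tree, or reject it with nil. */
static mixed *test_noun(mixed *arg)
{
    if (nouns[arg[0]]) {
        return arg;
    }
    return nil;
}

/* Called for the "adjective" rule, same convention as test_noun(). */
static mixed *test_adjective(mixed *arg)
{
    if (adjectives[arg[0]]) {
        return arg;
    }
    return nil;
}

/*
 * Hypothetical entry point: returns the matched words, or nil when the
 * input does not have the shape 'optional adjectives, then a noun'.
 * Only lowercase ASCII words are tokenized here (assumption).
 */
mixed *parse_object_phrase(string str)
{
    if (!nouns) {
        init_words();
    }
    return parse_string(
        "whitespace = /[ ]+/\n" +
        "word = /[a-z]+/\n" +
        "OBJ: opt_adjectives noun\n" +
        "noun: word ? test_noun\n" +
        "opt_adjectives:\n" +
        "opt_adjectives: opt_adjectives adjective\n" +
        "adjective: word ? test_adjective\n",
        str);
}
```

With these placeholder lists, 'rusty sword' should parse, while 'rusty small'
should make parse_string() return nil, since no word in it passes test_noun.
Resolving the parsed words to an actual object is a separate step that only
happens after a successful parse.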
This all comes down to needing some kind of lexicon for lookups. You could
build that on the fly from names used by objects in your game, or try to find
a free one for Polish that can at least tell you if a word is a (declension
of a) noun.
I'd think that in order to match player input against object names, you need
to be able to recognize the different forms of a noun that could be used by
the player and resolve them to the object's name anyway, which as far as I
can tell would require the same information.
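
As a rough sketch of such a lexicon, built on the fly as objects register
their names: a mapping from every known form of a noun to its base form.
add_noun() and query_base_noun() are hypothetical names, and whoever calls
add_noun() has to supply the forms (from the object itself or from an external
lexicon); nothing here derives them by rule.

```lpc
/* inflected form -> base noun; filled in as objects register their names */
private mapping lexicon;

/* Register a base noun together with the other forms it can take. */
void add_noun(string base, string *forms)
{
    int i;

    if (!lexicon) {
        lexicon = ([ ]);
    }
    lexicon[base] = base;
    for (i = 0; i < sizeof(forms); i++) {
        lexicon[forms[i]] = base;
    }
}

/* Returns the base noun for a known form, or nil for an unknown word. */
string query_base_noun(string word)
{
    if (!lexicon) {
        return nil;
    }
    return lexicon[word];
}
```

For example, after add_noun("kot", ({ "kota", "kocie" })), a call to
query_base_noun("kocie") gives "kot", and an unknown word gives nil. The same
lookup could back test_noun in the grammar. A form shared by two different
nouns would need a mapping to arrays of base nouns instead of a single string.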
Bart
--
Created with Open WebMail at http://www.bartsplace.net/
Read my weblog at http://soapbox.bartsplace.net/