Oops! Was Re: [DGD] LPC algorithm question
Noah Gibbs
angelbob at monkeyspeak.com
Sat Feb 23 23:00:20 CET 2002
I'm wrong -- you can hash just fine on an LWO. I was just not
initializing my hash table properly!
Mea culpa. I always find it right after hitting "Send" :-)
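
For the archive, here is a minimal sketch of the hash-table version of
filter described in the quoted message below. It assumes each entry is an
array with the <phrase> LWO at index 1, as in the examples there. The
local names (seen, entry, i, sz) are illustrative only and are not taken
from the real helpd code.

```lpc
private mixed *filter(mixed *entries)
{
    mapping seen;
    int i, sz;

    /*
     * The mapping has to be created explicitly; a freshly declared
     * mapping variable does not hold an empty mapping.
     */
    seen = ([ ]);

    for (i = 0, sz = sizeof(entries); i < sz; i++) {
        mixed *entry;

        entry = entries[i];
        /* entry[1] is assumed to be the <phrase> LWO; keep the first hit */
        if (!seen[entry[1]]) {
            seen[entry[1]] = entry;
        }
    }

    /* one entry per distinct <phrase> */
    return map_values(seen);
}
```

Each entry costs one mapping lookup, so this avoids the pairwise
comparison of a linear duplicate search. The returned entries come back
in the mapping's own order, not the order of the input array, so any
sorting has to happen afterwards.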
--
angelbob at monkeyspeak.com
See my page of DGD documentation at
"http://www.angelbob.com/projects/DGD_Page.html"
If you post to the DGD list, you may see yourself there!
On Sat, 23 Feb 2002, Noah Gibbs wrote:
> I'm writing a function, filter, which looks like this:
>
> private mixed* filter(mixed* entries);
>
> Entries is an array, where each element looks something like this:
>
> ["linkdead", <phrase>, (stuff)]
>
> The first element, a string, is a keyword for a help entry. The second
> element, "<phrase>", is an LWO containing, among other things, an array of
> strings sorted by locale (English, Spanish, etc). There is a one-to-many
> mapping between keywords and help entries, so for instance a particular
> help entry might be listed by the keywords "policy", "policies",
> "rule" and "rules". The <phrase> structure will be the same for all four
> entries for this help text, but the leading keyword string, the first
> element of an entry, will be different. In other words:
>
> ["policy", <phrase>, ...]
> ["policies", <phrase>, ...]
> ["rule", <phrase>, ...]
> ["rules", <phrase>, ...]
>
> Note <phrase> points to the same LWO in all four cases -- all of this is
> stored in the same heavyweight object (/usr/common/sys/helpd), so that
> should work fine.
>
> I want to take a list of entries like this and eliminate all duplicate
> <phrase> structures so that if the Soundex algorithm cheerfully came back
> with both "policy" and "policies" as a match for "help pollissee" only one
> of the two would be shown since they have the same text as their
> description.
>
> My first thought was to keep a hash table just within the function and
> hash each entry into the table using the phrase structure as a key, then
> just return the map_values array from the table. This doesn't work since
> you can't use an LWO as an index into a hash table. Hashing on the
> object_name of an LWO doesn't help. In C I'd have a pointer...
>
> I could just grab the English string out of each one in turn and use
> *that* as the hash key, though that will take more CPU since the help
> descriptions can be a page or more in length. It would also be more
> complex and error-prone since some entries may conceivably only exist in
> non-English languages so I'd need a number of lines of code to fish
> through the locales and come up with the right one to hash on in each
> case.
>
> Is there some easy way to do this without checking each one and
> eliminating the duplicates with a linear search? I'd like to avoid that
> since it would be O(n^2) time and I'm trying to add some more permissive
> Soundex matching which would return a much larger number of entries for
> sorting...
_________________________________________________________________
List config page: http://list.imaginary.com/mailman/listinfo/dgd