[DGD] DGD good choice for MUD with large-scale city simulation?

bart at wotf.org
Mon Dec 10 20:59:15 CET 2018


On Sun, 09 Dec 2018 22:04:18 +0100, Felix A. Croes wrote
> bart at wotf.org wrote:
> 
> >[...]
> > This is not a trivial system, and took quite a lot of work to be able to
> > handle this, but... I think you might underestimate how much work can be done
> > with just a single threaded dgd. I'd assume hydra can be made to handle 10m+
> > objects with proper care and design.
> 
> Naturally I am pleased that DGD is holding up so well under heavy load!
> But although the word "active" is a bit vague, active as in taking part
> in a database query is not quite the same as actively participating 
> in a simulation, with the objective of producing emergent story lines.

It's certainly not the same thing. I'd imagine such a simulation would consist
of many objects that act independently, each performing small and simple
actions that combine into something bigger and much more complex, where
interaction between objects is mostly an 'accidental' side effect. Something
like lpcdb, by contrast, has a top-down model: things start with a 'complex'
task which gets split up into many simple tasks, with many objects acting in
concert.
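
To make the bottom-up style concrete, here is a minimal sketch of such an
independent object as it could look in DGD LPC, using the call_out() kfun for
self-scheduling (act() and the timings are hypothetical, just to show the
shape):

    /* agent.c - an object that acts on its own schedule */
    static void act()
    {
        /* hypothetical: one small, local action, e.g. move or look around */
    }

    static void create()
    {
        call_out("tick", 1);              /* first action in 1 second */
    }

    static void tick()
    {
        act();
        call_out("tick", 1 + random(10)); /* reschedule at a randomized interval */
    }

Many such objects ticking independently is what lets interaction emerge as a
side effect instead of being coordinated from the top.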

> 
> I am becoming quite curious about lpcdb, however.

A long time ago I was looking at creating a releasable version, but I keep
getting caught up in loose ends, stubs that 'really' should at least get some
rudimentary implementation, and a lack of time due to a very busy job.

(In a way that is funny, because it started at some point with me throwing
away almost everything from gurbalib except for some bits of kernel and system
code, in an attempt to create a releasable version of my i3 router code... oh
well, I do need something to keep track of user accounts and settings... let's
add a small database... hey, this is nice, I can also turn this into a more
real database... add another year and I ended up with a somewhat unusual and
rather complex codebase that is not exactly easy to understand, and the exact
opposite of what I set out to do.)

Anyway, as to it using lots and lots of objects: it uses a 64x64
(configurable) 'grid' of storage objects for each column and for each index
(an index is really just a modified column). So a table with 10 columns will
have 10x64x64 storage objects (plus a few for administrative purposes), and
another 64x64 objects for each index (plus a toplevel object for
administrative purposes).
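
As a rough sketch of the arithmetic (names and the hashing scheme are
hypothetical; the real lpcdb layout may differ, and the administrative
objects are omitted):

    /* hypothetical addressing for a 64x64 storage grid */
    #define GRID_DIM 64

    /* map a row id to a ({ row, col }) grid cell */
    static int *grid_cell(int rowid)
    {
        return ({ (rowid / GRID_DIM) % GRID_DIM, rowid % GRID_DIM });
    }

    /* storage objects for a table: one grid per column, one per index */
    static int table_objects(int columns, int indices)
    {
        return columns * GRID_DIM * GRID_DIM          /* column grids */
             + indices * (GRID_DIM * GRID_DIM + 1);   /* index grids + toplevel */
    }

For the 10-column example that is 10 * 64 * 64 = 40960 storage objects before
any indexes, which is where the large object counts come from.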

In most cases it won't need to go through all of those, but that specific
search I mentioned, searching the quotes database for the word 'all', is
pretty much a worst-case scenario.

Text search uses 2 index levels. First there is a 'fragment' index: if you
look up the 'key' 'all' in that index, it returns the set of all words that
contain the sequence 'all'. It uses this set to look in the second index and
get a set of rows with quotes/fortunes containing those words, a separate set
for each word; these sets get combined with a simple | if they are small
enough.
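
A minimal sketch of that two-level lookup, assuming hypothetical
fragment_index and word_index objects with a lookup() function (the real
code is of course sharded over the grid objects described above):

    object fragment_index, word_index;  /* hypothetical index objects */

    static int *text_search(string fragment)
    {
        string *words;
        int *rows;
        int i;

        /* level 1: fragment -> all words containing it */
        words = fragment_index->lookup(fragment);

        /* level 2: each word -> set of row ids, combined with | (set union) */
        rows = ({ });
        for (i = 0; i < sizeof(words); i++) {
            rows |= word_index->lookup(words[i]);
        }
        return rows;
    }

In DGD, | on arrays is a set union, which is the 'simple |' combine mentioned
above.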

Now, the sequence 'all' appears in lots of words, but not in so many that it
gets ignored, so it results in many 2nd-level index lookups and many result
sets that have to be combined. In the end it still only retrieves each full
result once, but the number of 2nd-level index lookups and set combines makes
it very expensive compared to what would, at first glance, look like more
complex queries.
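
To put rough, purely illustrative numbers on it: if 'all' matches on the
order of 500 distinct words, that is ~500 second-level lookups and ~500 set
unions for one query, while a rarer fragment matching only 5 words needs 5 of
each, so the 'simple' query can easily be two orders of magnitude more work.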

I've been thinking of 2 ways to address this scenario. A simple one is to put
a lower limit on which words or fragments get ignored for being too generic,
but a better solution is to rank the results of the first index lookup and
only consider the 100 or so highest-ranked words.
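
A sketch of that ranking cutoff, assuming ranking by rarity (a hypothetical
count() function returning the number of rows per word; the real ranking
might use something smarter than raw frequency):

    #define MAX_CANDIDATES 100

    /* keep only the ~100 rarest words from the level-1 result */
    static string *prune(string *words, object word_index)
    {
        int i, j, n, c;
        int *counts;
        string w;

        n = sizeof(words);
        counts = allocate_int(n);
        for (i = 0; i < n; i++) {
            counts[i] = word_index->count(words[i]);  /* hypothetical */
        }
        /* insertion sort ascending by count, so the rarest words come first */
        for (i = 1; i < n; i++) {
            c = counts[i];
            w = words[i];
            for (j = i - 1; j >= 0 && counts[j] > c; j--) {
                counts[j + 1] = counts[j];
                words[j + 1] = words[j];
            }
            counts[j + 1] = c;
            words[j + 1] = w;
        }
        return (n > MAX_CANDIDATES) ? words[.. MAX_CANDIDATES - 1] : words;
    }

Rare words are the most selective, so keeping the rarest candidates trims the
second-level work while rarely losing results.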

Bart.
--
https://www.bartsplace.net/
https://wotf.org/
https://www.flickr.com/photos/mrobjective/



