[MUD-Dev] TECH: AmigaMud DB questions

Wed Apr 25 00:18:12 CEST 2001

Chris Gray wrote:

>> From: Bruce <bruce at puremagic.com>

> [Warning, long answer, with actual code in it!]

Yay!

>> The DB is something that I've been experimenting with a lot lately
>> inside of Cold.  One of the first things to go was the use of DBM
>> for maintaining the index because it simply didn't scale.  I'm
>> still not entirely thrilled with Berkeley DB, which I used to
>> replace it and will probably end up moving over to something simple
>> and similar to what you described.  Relying on other libs where
>> speed is critical hasn't worked well for us.

> Most traditional simple DB's are not good at random accesses to
> randomly sized entries. I knew from day 1 that most of my database
> objects would be of differing (and varying at runtime) sizes. So, I
> just went ahead and produced the kind of back-end I needed. It's had
> some pretty nasty bugs over time, but I haven't noticed any for
> quite a while. All of the original writing of my MUD system was done
> on the Amiga, and using my own Draco programming language, so pretty
> much any external package would have had to have been translated and
> ported before I could use it.

Agreed.  This is roughly what I tell people when they ask "But why
don't you just use Berkeley DB for all of your storage?"  I was unable
to get BDB to insert a couple million -small- and constant-sized
records in anything approaching an acceptable length of time unless I
threw a large cache at the DB. (About 145M seems to work fine.)
That bloats my process size, though, beyond what most people are
willing to accept, and I hadn't even tested storing large or randomly
sized objects in there. But maybe I was missing something wrt tuning
BDB.  (I still see a high cost for the index operations that currently
use BDB, which we adopted because it was faster than DBM.)

>> Do you still maintain refcounts for objects external to them in a
>> mapping?  Does the user have to write code that correctly manages
>> the refcounts on objects, or is that handled as part of the
>> scripting language?

> The object refcounts are officially stored in the object records in
> the database (in a small fixed header piece). The extra in-memory
> caches are used only to reduce the number of writes to the database
> itself. All of the refcounts (on quite a few kinds of entities) are
> invisible to the user. Two of the "dump out an entry" commands allow
> you to see the refcounts, but there is no way to actually change
> them - it is all done automatically as part of the lower-level
> routines that the MUD language uses to manipulate them. Similarly,
> when the refcounts go to 0, the system will free the used resources.
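
A minimal sketch of the scheme described above: the refcount in the
object's header stays authoritative, and an in-memory cache only
coalesces changes so that fewer header writes reach the database.
Everything named here (RefcountCache, the dict standing in for the
database, the record layout) is hypothetical and is not AmigaMUD's
actual code. Freeing at flush time is also a simplification of "free
when the count reaches 0".

```python
class RefcountCache:
    """Coalesces refcount changes in memory before touching the store."""

    def __init__(self, store):
        # store: dict of object id -> {"refcount": int, "body": bytes},
        # a stand-in for the fixed header piece of each database record.
        self.store = store
        self.pending = {}   # object id -> net delta not yet written
        self.writes = 0     # header writes actually issued

    def incref(self, oid):
        self.pending[oid] = self.pending.get(oid, 0) + 1

    def decref(self, oid):
        self.pending[oid] = self.pending.get(oid, 0) - 1

    def flush(self):
        """Apply the net deltas; free any object whose count reaches 0."""
        for oid, delta in self.pending.items():
            if delta == 0:
                continue                 # changes cancelled out: no write
            record = self.store[oid]
            record["refcount"] += delta
            self.writes += 1
            if record["refcount"] <= 0:
                del self.store[oid]      # release the object's resources
        self.pending.clear()


store = {1: {"refcount": 1, "body": b"sword"},
         2: {"refcount": 1, "body": b"lamp"}}
cache = RefcountCache(store)
cache.incref(1)
cache.decref(1)   # cancels the incref above
cache.decref(2)   # object 2 drops to zero
cache.flush()
```

In this run only one header write is issued: the two changes to
object 1 cancel, and object 2 is freed when its count reaches zero.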

Ahhh, we don't refcount objects but require manual management of
object lifetimes.  We do store some data on objects that doesn't
cause them to be written back out when it is modified.  It is only
used while the object is loaded, so we just don't flag the object as
dirty when it changes.

> Why? Well, so far it hasn't been a big issue. Certainly as the
> working set size increased, one would want to change it. Also, the
> CPU time of doing the flush isn't that great, so doing async I/O for
> the actual flushes might hide the latency. That would require some
> additional data structures, however, to keep track of what all is
> being written. The main saving, I expect, is that it's only a small
> percentage of the database that ever gets dirtied permanently enough
> to actually need writing to disk.

We track dirtied objects and only write back what we need.  We also
only write out objects when either they are forced out of the cache or
we're syncing to disk for a backup to occur.
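
A rough sketch of that write-back policy, assuming an LRU cache in
front of the store. The class and method names are made up for
illustration and are not Cold's actual API; a dict stands in for the
on-disk DB, and capacity is assumed to be at least 1.

```python
from collections import OrderedDict


class WriteBackCache:
    """Writes an object only when it is evicted dirty or on sync()."""

    def __init__(self, disk, capacity):
        self.disk = disk                # dict standing in for the DB file
        self.capacity = capacity        # max objects held in memory (>= 1)
        self.objects = OrderedDict()    # oid -> value, least recent first
        self.dirty = set()              # oids modified since last write
        self.writes = 0                 # writes that actually hit "disk"

    def _write(self, oid):
        self.disk[oid] = self.objects[oid]
        self.dirty.discard(oid)
        self.writes += 1

    def _touch(self, oid):
        # Mark oid as most recently used, then evict down to capacity.
        self.objects.move_to_end(oid)
        while len(self.objects) > self.capacity:
            victim = next(iter(self.objects))
            if victim in self.dirty:
                self._write(victim)     # forced out of the cache
            del self.objects[victim]

    def get(self, oid):
        if oid not in self.objects:
            self.objects[oid] = self.disk[oid]
        self._touch(oid)
        return self.objects[oid]

    def put(self, oid, value):
        self.objects[oid] = value
        self.dirty.add(oid)
        self._touch(oid)

    def sync(self):
        # Flush everything dirty, e.g. before a backup.
        for oid in list(self.dirty):
            self._write(oid)


disk = {"a": 1, "b": 2, "c": 3}
cache = WriteBackCache(disk, capacity=2)
cache.get("a")        # loaded clean
cache.put("b", 20)    # dirty, not yet on disk
cache.get("c")        # evicts "a", which is clean: no write
cache.get("a")        # evicts "b", which is dirty: one write
cache.put("c", 30)
cache.sync()          # writes "c"
```

Clean objects leave the cache without any I/O, so the number of
writes tracks the number of dirtied objects rather than the number of
objects touched.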

Regarding the flush, I see a fairly reasonable amount of time being
spent in fflush.  When compiling a DB (about 1.5 gig) and doing an
initial population of a binary DB, we spend about 5.5 minutes in our
db_put() routine.  Of that, roughly 70 seconds is spent in fflush,
both directly and as a result of calling fseek.  I plan on looking at
shifting to asynch I/O experimentally to see how that might impact
both this and normal runtime operations.  Has anyone else tried this
and seen benefit while running under decent load?
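
One possible shape for such an experiment, sketched with a background
writer thread rather than true OS-level async I/O. The in_flight
table is the kind of "additional data structure" needed to keep track
of what is still being written. AsyncWriter, write_fn and the other
names are hypothetical, and this is not how Cold's db_put() works.

```python
import queue
import threading

_STOP = object()   # sentinel telling the writer thread to exit


class AsyncWriter:
    """Queues object writes so the caller never waits on the flush."""

    def __init__(self, write_fn):
        self.write_fn = write_fn    # performs the slow, blocking write
        self.jobs = queue.Queue()
        self.in_flight = {}         # oid -> newest data not yet written
        self.lock = threading.Lock()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def put(self, oid, data):
        with self.lock:
            self.in_flight[oid] = data   # a newer version replaces older
        self.jobs.put(oid)

    def pending(self, oid):
        # Readers check here first so they never see stale disk data.
        with self.lock:
            return self.in_flight.get(oid)

    def _run(self):
        while True:
            oid = self.jobs.get()
            if oid is _STOP:
                break
            with self.lock:
                data = self.in_flight.get(oid)
            if data is None:
                continue                 # a later version was written
            self.write_fn(oid, data)
            with self.lock:
                # Drop the entry only if nothing newer arrived meanwhile.
                if self.in_flight.get(oid) is data:
                    del self.in_flight[oid]

    def close(self):
        self.jobs.put(_STOP)
        self.thread.join()


disk = {}
writer = AsyncWriter(disk.__setitem__)
writer.put("obj1", b"v1")
writer.put("obj1", b"v2")   # supersedes v1 if v1 is not yet written
writer.put("obj2", b"x")
writer.close()              # drains the queue, then stops the thread
```

Whether this hides the flush latency under real load is exactly the
open question; the sketch only shows the bookkeeping involved.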

> Is a synchronous backup like this scalable? Probably not in the long
> run, but you'd have to be running a serious system in order to want
> to risk corrupted databases to avoid the backups impacting the
> players. The players seeing this ('APrint' is a broadcast to
> players) breaks the illusion of a virtual world, but as a player, I
> think I would find it re-assuring.

It is also a nice enough way to explain why everything has halted for
some period of time. :) With a big DB and a version of Cold not far
from what is released, the synchronous part of a backup can take a
couple of minutes before moving into the asynch part where players can
do things again.  For muds of the sizes that most people on mud-dev
are familiar with, the synchronous portion of the backup would take
only seconds.

> Since the original version was done on the Amiga, I chose big-endian
> as the standard for the database (and for messages). So, on that
> platform, there really isn't any conversion necessary. On the X86
> platform, I have to do endian-conversions. There really aren't any
> other conversions that I have to do - I arranged all of the types so
> that they pack properly into structures, and just read and write the
> structures from the database.
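
As an illustration of a fixed, big-endian record layout like the one
described above, here is a small sketch using Python's struct module.
The header fields are invented for the example and are not AmigaMUD's
actual layout. The ">" prefix selects big-endian byte order with no
implicit padding, so the bytes are identical on Amiga-style and x86
hosts, and the explicit pad field keeps the record at 16 bytes.

```python
import struct

# Hypothetical header: object id (u32), refcount (u32),
# body length (u32), flags (u16), explicit padding (u16).
HEADER = struct.Struct(">IIIHH")


def pack_header(oid, refcount, body_len, flags):
    """Serialize the header fields into 16 big-endian bytes."""
    return HEADER.pack(oid, refcount, body_len, flags, 0)


def unpack_header(raw):
    """Parse 16 big-endian bytes back into the header fields."""
    oid, refcount, body_len, flags, _pad = HEADER.unpack(raw)
    return oid, refcount, body_len, flags


raw = pack_header(1, 2, 0x0102, 3)
fields = unpack_header(raw)
```

On a little-endian host the byte swapping happens inside pack and
unpack, which corresponds to the endian conversion step mentioned for
the X86 platform.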

> There are conversions for some of the higher-level server types that
> are stored in the database (MUD-code, tables, grammars), but they
> are not read or written very often, since they each have independent
> in-memory caches in front of the database itself.

Ahhh, a very large difference from us. :) We have to do conversions to
serialize everything to disk. (In the above case where I gave numbers
from a profile of a DB compile, object serialization was almost 2
minutes of the 5.5 minutes spent putting objects into the DB.)  We
also store everything on objects, including the bytecode for methods.
We've given thought to caching methods separately, but haven't ever
finished the work in an experimental form to see what the gain might
be.

>> Are there good open source/free software lightweight OODBs around,
>> preferably with a C API?  I've not seen one and I keep getting
>> tempted to rip that layer out of Cold and package it up separately
>> as a lib. It'd be nice if any suggestions were actively developed
>> as well. :)

> That question I'll leave to others, since I'm not knowledgeable
> there.

I've looked on sourceforge, freshmeat and other places.  I found none
that matched what I already have in Cold and what you apparently have
in your system. :) I can't speak to how these compare with the systems
present in Muq or DGD.  It looks like it would make sense to separate
out the block storage layer from Cold and present it as a separate
library.  I could definitely use it in some of my more recent
projects.

Have you made any decisions yet on the status of your source?

  - Bruce

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev