[MUD-Dev] TECH: AmigaMud DB questions

Chris Gray cg at ami-cg.GraySage.COM
Sat Apr 21 11:03:38 CEST 2001


> From: Bruce <bruce at puremagic.com>

[Warning, long answer, with actual code in it!]

> Is your description of your DB in:

>   http://www.kanga.nu/archives/MUD-Dev-L/1998Q2/msg00018.php

> still accurate?  How has that worked out for you?

Yes, that's still accurate. Most of the recent work has involved the port
of the server, adding support for the new client stuff, and some fairly
extensive modifications to the scenario code to integrate the tile-based
mode with the older room-based mode.

> The DB is something that I've been experimenting with a lot lately
> inside of Cold.  One of the first things to go was the use of DBM for
> maintaining the index because it simply didn't scale.  I'm still not
> entirely thrilled with Berkeley DB, which I used to replace it and
> will probably end up moving over to something simple and similar to
> what you described.  Relying on other libs where speed is critical
> hasn't worked well for us.

Most traditional simple DBs are not good at random accesses to randomly
sized entries. I knew from day 1 that most of my database objects would be
of differing (and varying at runtime) sizes, so I just went ahead and
produced the kind of back-end I needed. It's had some pretty nasty bugs
over time, but I haven't noticed any for quite a while. All of the original
writing of my MUD system was done on the Amiga, using my own Draco
programming language, so pretty much any external package would have had
to be translated and ported before I could use it.
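
To make that concrete, the general shape is a fixed-size index file
mapping keys to offsets, sitting in front of a data file of variable-length
records. Something like this sketch (the field names are illustrative; only
ix_offset and the IX_FREE flag actually appear in the flush code further
down):

    /* One slot in the up-front index file, addressed directly by key. */
    typedef struct {
        unsigned long ix_offset;    /* byte offset of the record in the
                                       data file; the IX_FREE bit marks
                                       an unused slot */
    } IndexEntry_t;

    /* Small fixed header in front of each variable-length record. */
    typedef struct {
        unsigned long ob_length;    /* current length of the variable part */
        unsigned long ob_refCount;  /* see the refcount discussion below */
    } ObjectHeader_t;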

> Do you still maintain refcounts for objects external to them in a
> mapping?  Does the user have to write code that correctly manages the
> refcounts on objects, or is that handled as part of the scripting
> language?

The object refcounts are officially stored in the object records in the
database (in a small fixed header piece). The extra in-memory caches are
used only to reduce the number of writes to the database itself. All of
the refcounts (on quite a few kinds of entities) are invisible to the
user. Two of the "dump out an entry" commands allow you to see the refcounts,
but there is no way to actually change them - it is all done automatically
as part of the lower-level routines that the MUD language uses to
manipulate them. Similarly, when the refcounts go to 0, the system will
free the used resources.
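
In other words, the only code that ever touches a refcount is internal to
the server; the decrement path amounts to something like this sketch
(made-up routine names, reusing the ObjectHeader_t shape from above):

    /* Hypothetical internal decrement path - scripts never see this. */
    void ObjectDecRef(unsigned long key) {
        ObjectHeader_t *oh;

        oh = CacheLookup(key);      /* faults the record in if necessary */
        oh->ob_refCount -= 1;
        if (oh->ob_refCount == 0) {
            ObjectDelete(key);      /* free record space and index slot */
        } else {
            CacheNoteDirty(key);    /* in-memory caches batch the writes */
        }
    }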

> Does your LRU cache have multiple buckets or just a single linked
> list? (I'd like to understand what DGD does here at some point as
> well.) Do you really flush the whole cache to disk when you have no
> empty cache slots available? (If so, why the whole cache?)

Just a single linked list. There is only one set of forward/backward
links in the in-memory copy of each entry. The up-front mapping tables
can thus only point to one entry per table slot.
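
So an in-memory cache entry looks roughly like this sketch (the ca_ fields
match the ones used in the flush code just below; the rest is guesswork):

    typedef struct CacheEntry {
        struct CacheEntry *ca_next;     /* LRU chain; reused as the bucket
                                           chain during a flush */
        struct CacheEntry *ca_prev;
        unsigned long ca_key;           /* the object's key */
        unsigned long ca_slotLength;    /* entry length, with the CA_DIRTY
                                           flag kept in a spare bit */
        /* ... the cached record data itself follows ... */
    } CacheEntry_t;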

Yes, I flush the whole cache. Have some code: :-)

    /* portion the dirty entries into the buckets based on their file offset */

    ca = CacheLRUHead;
    while (ca != P_NULL) {
	next = ca->ca_next;
	if ((ca->ca_slotLength & CA_DIRTY) != 0) {
	    ix = IndexPtr + (ca->ca_key & INDEX_MASK);
	    offset = ix->ix_offset;
	    if ((offset & IX_FREE) != 0) {
		logX("object ", ca->ca_key, " is dirty, but free in index\n");
		localAbort();
	    }
	    /* 'shift' scales file offsets down into SORT_BUCKETS buckets */
	    offset >>= shift;
	    ca->ca_next = radixTable[offset];
	    radixTable[offset] = ca;
	}
	ca = next;
    }

    /* go flush each bucket (an n**2 operation) */

    for (i = 0; i < SORT_BUCKETS; i += 1) {
	ca = radixTable[i];
	if (ca != P_NULL) {
	    ioFlushBucket(ca);
	}
    }

Why? Well, so far it hasn't been a big issue. Certainly, as the working
set size increases, one would want to change it. Also, the CPU time of doing
the flush isn't that great, so doing async I/O for the actual flushes might
hide the latency. That would require some additional data structures,
however, to keep track of what all is being written. The main saving, I
expect, is that it's only a small percentage of the database that ever
gets dirtied persistently enough to actually need writing to disk.
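
If I did go async, the extra bookkeeping might be no more than a table of
in-flight writes, along these lines (purely hypothetical - nothing like
this exists in the server today):

    /* Hypothetical record of one asynchronous write in progress. */
    typedef struct {
        unsigned long iw_key;       /* object whose record is in flight */
        char *iw_buffer;            /* private snapshot being written, so
                                       the cached copy can be re-dirtied
                                       in the meantime */
        int iw_pending;             /* nonzero until the write completes */
    } InFlightWrite_t;

A cache miss on an object with iw_pending set would have to be satisfied
from iw_buffer instead of from the file.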

> How do you handle backups?  I'm not entirely happy with the amount of
> time they take in Cold ... although, I think that with some tuning,
> they could definitely perform better than they do now by copying more
> blocks at once (Cold's DB has a minimal block size, defaulting to 256
> bytes). The asynch part of our backup is pretty nice though, although
> it had a data-corrupting bug in it that was hard enough to tickle that
> it wasn't noticed for about 4-5 years.  Of course, it only corrupted
> the backup DB, not the running one. It'd also be nice to have a way to
> provide transactional capability .. but that's not happening within
> Cold anytime soon.  Transactional stuff is the only reason that I'm
> jealous of DGD. :)

I ended up making backups visible to the players. Have some more code, this
time MUD-code:

define tp_baseVerbs proc DoBackup(thing obj)void:
    int saveLimit;

    /* We want to schedule the next backup before we actually do this one,
       so that the timekeeper will have us on its list in the backup! */
    if GlobalThing@p_BackupInterval ~= 0 then
        DoAfter(GlobalThing@p_BackupInterval * 60, GlobalThing, DoBackup);
    fi;
    if GlobalThing@p_FlushNeeded then
        saveLimit := RunLimit(100);
        /* clear flag *before* doing backup, so are clean after restore */
        GlobalThing@p_FlushNeeded := false;
        Log("Starting automatic backup, interval = " +
            IntToString(GlobalThing@p_BackupInterval) + " minutes\n");
	APrint("\n * Timed database backup starting - please wait. *\n");
	Flush();
/*	Execute("copy MUD.#? backup"); */
	Execute("cp MUD.* backup");
	APrint(" * Backup complete. *\n\n");
	ignore RunLimit(saveLimit);
    fi;
corp;

So, it essentially just uses a server-provided "Flush" routine which flushes
everything through the various caches, and then flushes the database. After
that, it copies the database files to a backup sub-directory. One could get
smarter and use multiple backup directories. The key here is to stop any
changes to the database while the 'Flush' is running, and to stop any
database writes while the file copy is going on.
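
Server-side, 'Flush' doesn't have to be anything fancy; the ordering is
the important part, since the higher-level caches (MUD-code, tables,
grammars - more on those below) have to drain into the object cache before
it is written out. Roughly (the individual routine names here are
invented):

    /* Sketch of the server-side Flush(): drain top-level caches first. */
    void Flush(void) {
        FlushCodeCache();       /* compiled MUD-code */
        FlushTableCache();      /* tables */
        FlushGrammarCache();    /* parser grammars */
        ioFlushCache();         /* the bucketed object-cache flush shown
                                   earlier */
        ioFlushIndex();         /* finally, the up-front mapping tables */
    }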

Is a synchronous backup like this scalable? Probably not in the long run,
but you'd have to be running a pretty serious system before it would be
worth risking corrupted databases just to keep backups from impacting the
players. The players seeing this ('APrint' is a broadcast to players)
breaks the illusion of a virtual world, but as a player, I think I would
find it reassuring.

> Another interesting topic is the conversion of your in-memory objects
> to the buffer that you store on disk .. that's kinda slow in Cold at
> the moment, but I'm too tired to ask questions about that yet. :)

Since the original version was done on the Amiga, I chose big-endian as
the standard for the database (and for messages). So, on that platform,
there really isn't any conversion necessary. On the x86 platform, I have
to do endian-conversions. There really aren't any other conversions that
I have to do - I arranged all of the types so that they pack properly into
structures, and just read and write the structures from the database.
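
The conversion itself is just field-by-field byte swapping as records
cross the memory/disk boundary, in the spirit of this sketch (assuming
32-bit longs, as on the platforms involved):

    /* Swap a 32-bit value between big-endian (disk) and little-endian
       (x86) order; on big-endian hosts this can compile away entirely. */
    unsigned long swap32(unsigned long v) {
        return ((v >> 24) & 0x000000ffUL) | ((v >> 8) & 0x0000ff00UL) |
               ((v << 8) & 0x00ff0000UL) | ((v << 24) & 0xff000000UL);
    }

    /* Applied to each fixed field on the way in and on the way out,
       e.g. oh->ob_refCount = swap32(oh->ob_refCount); */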

There are conversions for some of the higher-level server types that are
stored in the database (MUD-code, tables, grammars), but they are not
read or written very often, since they each have independent in-memory
caches in front of the database itself.

> Are there good open source/free software lightweight OODBs around,
> preferably with a C API?  I've not seen one and I keep getting tempted
> to rip that layer out of Cold and package it up separately as a
> lib. It'd be nice if any suggestions were actively developed as
> well. :)

That question I'll leave to others, since I'm not knowledgeable there.

--
Don't design inefficiency in - it'll happen in the implementation.

Chris Gray     cg at ami-cg.GraySage.COM
               http://www.GraySage.COM/cg/


