[MUD-Dev] ColdStore. Belated response from a developer.

Sat Apr 22 10:19:59 CEST 2000

I write this in response to a thread from December 1999, of which I've just 
become aware.  In the thread, Miro <silovic#zesoi,fer.hr> attempts some 
helpful criticism of ColdStore, which I seek to rebut here.

Miro writes, of ColdStore,

> It's a C++ library that mmaps a file, and then handles the allocation and
> refcounting. This is very fast, of course, as the allocation handles the
> locality, but that's the only good thing one can say about this approach.
> Problems:

> - mmap is limited by the architecture. With normal databases you can download
> as needed, and the db file(s) can be bigger than the address space - if you
> carefully split your db, for example. With mmap, you're limited to the largest
> contiguous segment in your address space, which really is about 1 GB on most
> OSes.

That's true.  64bit machines are around the corner: when the machine's address 
space exceeds your available disk storage this limitation disappears.
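
To illustrate where the limit comes from: a store of this kind maps its whole file into one contiguous range of addresses and then reaches the contents through ordinary pointers. The following is only a minimal sketch assuming POSIX `mmap`; the class name `MappedStore`, the helper `round_trip`, and the file path are hypothetical, and this is not ColdStore's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical illustration, not ColdStore's code: a file mapped into the
// address space so that its contents are reached through plain pointers.
class MappedStore {
public:
    MappedStore(const std::string& path, std::size_t size) : size_(size) {
        fd_ = ::open(path.c_str(), O_RDWR | O_CREAT, 0600);
        if (fd_ < 0) throw std::runtime_error("open failed");
        // Make the file exactly as large as the mapping.
        if (::ftruncate(fd_, static_cast<off_t>(size)) != 0) {
            ::close(fd_);
            throw std::runtime_error("ftruncate failed");
        }
        // The whole file must fit in one contiguous range of addresses,
        // which is why the store cannot outgrow the address space.
        void* p = ::mmap(nullptr, size, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd_, 0);
        if (p == MAP_FAILED) {
            ::close(fd_);
            throw std::runtime_error("mmap failed");
        }
        base_ = static_cast<char*>(p);
    }
    ~MappedStore() {
        ::munmap(base_, size_);
        ::close(fd_);
    }
    MappedStore(const MappedStore&) = delete;
    MappedStore& operator=(const MappedStore&) = delete;

    char* data() { return base_; }
    std::size_t size() const { return size_; }

private:
    int fd_ = -1;
    char* base_ = nullptr;
    std::size_t size_;
};

// Usage: write through the mapping, drop it, map the file again, read back.
inline bool round_trip(const std::string& path) {
    {
        MappedStore s(path, 4096);
        std::strcpy(s.data(), "hello");
    }
    MappedStore again(path, 4096);
    return std::strcmp(again.data(), "hello") == 0;
}
```

Nothing is copied or translated on access, which is where the speed comes from; the price is that the mapping size is bounded by the free contiguous address range.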

> This seems a lot but I know of at least one production MUD whose db size hit
> 1 GB at one point (that's when they rewrote much of their code for data size
> efficiency) - and that was Cold db that compresses small integers, so it'd be
> 1.5-2 GB of pure mmapped binary.

ColdStore compresses small integers too.

I'd have thought the storage requirements of Cold and ColdStore were 
comparable.

> - bug recovery is nonexistent. If your broken C++ code hits mmapped memory,
> you can say goodbye to the entire db.

ColdStore doesn't set out to make C++ programming safe, because that's 
impossible.  It sets out to provide a toolkit for supporting higher-level 
interpreted languages in which one would write applications.

It's certainly necessary to thoroughly engineer the C++ classes you use and to 
avoid world-stopping bugs.  However, Miro's argument could apply to any complex 
piece of code upon which your world depends: the o/s, the db backend.

You could look at ColdStore as a C++-extensible database engine which has been 
optimised for the kinds of storage one would require for MUDs.  Certainly, if 
the database is buggy your MUD will be toast.  Recommend: don't extend the db 
with buggy code.

> - versioning issues. While upgrading ColdStore may be handled, if you change
> the layout of your objects, well... conversion is impossible, since you've
> just mmapped the binary.

ColdStore defines a virtual protocol based roughly on Python's API.  One of 
the extensions is a toconstructor() call, which is intended to be used to 
serialise objects for the purpose of reconstruction.

> Note that mmapped is okay if you make it possible to serialize the objects in
> some intelligent way and then copy them into the mmapped memory.

One could certainly use the toconstructor() facility for changing the raw 
layout of objects.

This would be painful, however, and one would be well advised to think through 
the raw layout of lowest level objects before committing them to persistent 
store.
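
As a rough illustration of how a constructor-style serialisation can carry an object across a layout change, here is a minimal sketch. Only the name `toconstructor()` comes from the text above; its signature here, the textual format, the `PointV1`/`PointV2` types, and the `fromconstructor()` helper are assumptions made for the example, not ColdStore's actual interface.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Old raw layout (hypothetical): two narrow fields.
struct PointV1 {
    short x, y;

    // Emit a layout-independent constructor expression, e.g. "Point(3,-4)".
    std::string toconstructor() const {
        std::ostringstream out;
        out << "Point(" << x << "," << y << ")";
        return out.str();
    }
};

// New raw layout (hypothetical): wider fields plus an added member.
struct PointV2 {
    long x, y, z;

    // Rebuild an object in the new layout from the old object's expression.
    static PointV2 fromconstructor(const std::string& text) {
        std::istringstream in(text);
        std::string name;
        std::getline(in, name, '(');          // type name up to '('
        if (name != "Point") throw std::runtime_error("unexpected type");
        PointV2 p{0, 0, 0};                   // z takes a default value
        char sep = 0, close = 0;
        if (!(in >> p.x >> sep >> p.y >> close) || sep != ',' || close != ')')
            throw std::runtime_error("malformed constructor expression");
        return p;
    }
};
```

Every object in the store would have to be pushed through such a round trip, which is why fixing the raw layout early is the cheaper option.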

> - db format is highly non-portable. It's not just architecture dependent,
> it's C++ compiler dependent (as object layout may change).

Certainly.  It was a design decision to localise the cost of moving between 
architectures to the time when one might wish to actually move them, rather 
than seek to support an abstracted/portable format for which one pays on each 
access.

Many database formats are not immediately/binary portable; usually the 
databases have to be serialised or marshalled before moving them.

> - refcounting. Well. Some people swear by it. Usually the same people who
> haven't explored its interactions with threading.

I was in the process of writing it right out in favour of GC, when someone 
pointed out to me that refcounting is a great storage optimiser - ColdStore 
collections and vector types share all substructure, and implement Copy On 
Write semantics on low level objects.
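
A minimal sketch of the idea, for readers who have not seen reference counts used this way: copies share one block of storage, and the count tells a writer whether it must make a private copy first. The class `CowVector` is hypothetical, and `std::shared_ptr`'s count merely stands in for ColdStore's own refcounting here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical copy-on-write vector. Not thread-safe as written: the
// use_count() check and the copy are not one atomic step.
class CowVector {
public:
    CowVector() : data_(std::make_shared<std::vector<int>>()) {}

    // The implicit copy constructor copies only the shared_ptr, so a copy
    // shares all storage with its source until one of them is written to.
    std::size_t size() const { return data_->size(); }
    int at(std::size_t i) const { return data_->at(i); }

    void push_back(int v) { detach(); data_->push_back(v); }
    void set(std::size_t i, int v) { detach(); data_->at(i) = v; }

    // True while both objects still point at the same storage.
    bool shares_with(const CowVector& other) const {
        return data_ == other.data_;
    }

private:
    // Copy the storage only if someone else also holds a reference to it.
    void detach() {
        if (data_.use_count() > 1)
            data_ = std::make_shared<std::vector<int>>(*data_);
    }

    std::shared_ptr<std::vector<int>> data_;
};
```

A tracing collector can tell that storage is unreachable, but it does not cheaply answer "am I the only holder right now?", which is the question copy-on-write needs answered on every write.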

> The issue is that increasing refcount needs to be done atomically - when you have LOTS of threads, well... that just won't work.

Of course it will work.  It just entails some possible contention.  Most 
modern (like, post Z80 :) CPUs provide for atomic increment/decrement of 
integers because, as Miro says, there's a lot of contention for counters.

> You either have to use extra spin lock or use inline assembly for atomic
> increment. And since you increment/decrement refcounts a LOT (even passing
> pointers as parameters may cause refcounts to bump), this is major issue.

I don't see the problem with using inline assembly for atomic increments, 
that's what inline assembly is for.  It's what we use, nicely wrapped in a 
Counter class.
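
For illustration only, a counter wrapper of this shape might look as follows. This sketch swaps in the standard `std::atomic` (available since C++11) in place of the inline assembly described above; the member names and memory orderings are assumptions, not ColdStore's actual Counter class.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference counter built on std::atomic rather than on
// hand-written inline assembly.
class Counter {
public:
    explicit Counter(long initial = 0) : value_(initial) {}

    // Taking a new reference needs atomicity but no ordering guarantees.
    void increment() { value_.fetch_add(1, std::memory_order_relaxed); }

    // Returns true when this call dropped the last reference, i.e. the
    // caller is now responsible for freeing the object.
    bool decrement() {
        return value_.fetch_sub(1, std::memory_order_acq_rel) == 1;
    }

    long get() const { return value_.load(std::memory_order_acquire); }

private:
    std::atomic<long> value_;
};
```

On common CPUs each call compiles down to a single atomic read-modify-write instruction, so no separate spin lock is involved; threads contend only when they touch the same counter at the same moment.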

Maximising data locality by sharing substructure will always tend to increase 
contention between threads at the same time as it tends to decrease page 
faults.  This is a classic trade-off: throughput vs. responsiveness.  Are 
networks really fast enough, yet, that response-lag is an issue?

As long as the time taken to swap a page from disk is much longer than a 
thread context switch, I'll tend to worry more about locality than thread 
performance, most particularly since most large MUD architectures don't 
even have real multitasking.

> BTW, has there been any discussion on the list on incremental/generational
> GC? (that works best for MUDs by far, IMHO - you can really bump up your
> realtime response once you implement incrementality properly).

As mentioned, while concurrent GC is a really interesting field, refcounting 
is not merely a garbage collection facility.

> - no way to logically split your db into multiple files. ColdStore stores
> direct binary pointers.

Later versions may provide for a tiling of mmaps over the address space.

> You also have Texas Persistent Store [...] ColdStore authors chose not to implement this because, according to them, it performs really slowly.

Yes, according to reported benchmarks it performs 10,000 times slower than the 
equivalent direct pointer access methods.

Additionally, the authors of Texas don't answer email, the build process 
entails heavy use of a hacked gdb (using STABS to generate object 
interpretation code for swizzling) and Texas has no support for dynamically 
loading new objects into a store.

> I'd say [ColdStore] is great for small MUDs that need to run on tiny (read:
> outdated) servers and aren't changed a lot after the initial hardcode is in
> place.

I'd never seek to write any but a trivial MUD directly on ColdStore's layer1 
object protocol, but then that's not what ColdStore was designed for.

I think layer1 is a fairly good base for implementing virtual machines, at 
what we've called layer2, and would welcome criticism of ColdStore with this 
intended use in mind.

We're currently developing languages and their associated virtual machines at 
layer2.  I think it would make sense to use the languages being built on layer2 
to implement a MUD-like system.

> It's faster than anything that uses separate db engine and it handles small
> dbs efficiently. But I think it'd be a mistake to use it for a (potentially)
> large size world.

I don't consider 1.75GB to be small, on a 32bit machine.  And given our 
current rate of progress, 64bit systems will be prevalent by the time we 
deliver our first Dispute Resolution System object :)

Colin.

_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
http://www.kanga.nu/lists/listinfo/mud-dev