[MUD-Dev] ColdStore. Belated response from a developer.

colin at field.medicine.adelaide.edu.au colin at field.medicine.adelaide.edu.au
Mon Apr 24 18:43:09 CEST 2000


> Miroslav Silovic <silovic at zesoi.fer.hr> writes:
> colin at field.medicine.adelaide.edu.au writes:
> 
> > ColdStore compresses small integers too.
> 
> Could you post a bit more about ColdStore internals on this list?

ColdStore's a layered system.  At layer0 you have a locale-based allocator 
which works fairly hard to keep regions allocated to a given locale on a 
minimal number of pages (the theory being to reduce hard page faults).  At 
layer1 you have Yet Another Class Library, with collection objects, BigInts, 
Strings, and whatever else you believe should be a primitive datatype for 
whatever you're building above it.  At layer2 are objects designed to support 
language interpretation and such.
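
To give a feel for the layer0 idea (this is purely my own illustration; the 
names and interface here are invented, not ColdStore's actual qvmm API): a 
locale-hinted allocator hands out space from per-locale arenas, so objects 
allocated against the same locale end up packed onto the same few pages.

#include <cstddef>
#include <vector>

// Hypothetical sketch of a locale-hinted allocator in the spirit of
// layer0 (invented names, not ColdStore's real interface): each Locale
// hands out space from its own page-sized arenas, so objects allocated
// to the same locale stay spatially close.
class Locale {
    static const std::size_t kPageSize = 4096;
    std::vector<char *> pages_;   // arenas owned by this locale
    std::size_t used_;            // bytes used in the current arena

public:
    Locale() : used_(kPageSize) {}

    void *allocate(std::size_t n) {
        n = (n + 7) & ~(std::size_t) 7;       // keep longword alignment
        if (used_ + n > kPageSize) {          // current arena exhausted
            pages_.push_back(new char[kPageSize]);
            used_ = 0;
        }
        void *p = pages_.back() + used_;
        used_ += n;
        return p;
    }

    ~Locale() {
        for (std::size_t i = 0; i < pages_.size(); ++i)
            delete[] pages_[i];
    }
};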

From layer1 up, the objects inherit from a primitive Data class which exports 
a virtual protocol based broadly on Python's API.  The idea here is that you 
write any useful datatype you want to use above this level, coerce its 
interface to look like Data's, and get immediate access to it from languages 
implemented at layer2.  Layer1 objects written to date have been optimised 
wherever possible to use layer0's allocation tactics to maximise spatial 
locality of reference.
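
To make that concrete (a sketch only; the method names are my own guesses at a 
Python-flavoured protocol, not the real Data interface): you derive from the 
abstract base, implement the generic operations, and the interpreter at layer2 
only ever manipulates the base.

#include <stdexcept>

// Hypothetical sketch, not ColdStore's actual Data class: an abstract
// base exporting a small Python-flavoured virtual protocol, plus one
// concrete datatype coerced to that interface.
class Data {
public:
    virtual ~Data() {}
    virtual Data *add(const Data *other) const = 0;  // akin to Python's __add__
    virtual long  length() const = 0;                // akin to Python's __len__
    virtual Data *clone() const = 0;
};

class Integer : public Data {
    long value_;
public:
    explicit Integer(long v) : value_(v) {}

    Data *add(const Data *other) const {
        const Integer *o = dynamic_cast<const Integer *>(other);
        if (!o) throw std::runtime_error("type mismatch");
        return new Integer(value_ + o->value_);
    }
    long  length() const { return 1; }
    Data *clone()  const { return new Integer(value_); }
};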

Layer1 uses a Slot smartpointer to reference Data instances, maintaining 
refcounts and such.  It additionally exploits the fact that qvmm longword-aligns 
all allocations to tag short integers (31 bits), transparently converting them 
to pool-allocated Integers when they need to be operated on.
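
The tagging trick itself is easy to sketch (my own illustration, not the actual 
Slot code): because every qvmm allocation is longword aligned, the low bit of a 
genuine pointer is always zero, so a word with its low bit set can carry an 
immediate 31-bit integer instead.

#include <cassert>
#include <stdint.h>

// Illustrative sketch of low-bit tagging (not the real Slot class):
// aligned pointers always have bit 0 clear, so bit 0 set marks an
// immediate 31-bit integer held in the remaining bits.
class TaggedWord {
    uintptr_t word_;
public:
    static TaggedWord fromInt(long v) {
        TaggedWord t;
        t.word_ = (static_cast<uintptr_t>(v) << 1) | 1;     // set tag bit
        return t;
    }
    static TaggedWord fromPtr(void *p) {
        assert((reinterpret_cast<uintptr_t>(p) & 1) == 0);  // relies on alignment
        TaggedWord t;
        t.word_ = reinterpret_cast<uintptr_t>(p);
        return t;
    }
    bool  isInt() const { return (word_ & 1) != 0; }
    long  asInt() const { return static_cast<long>(static_cast<intptr_t>(word_) >> 1); }
    void *asPtr() const { return reinterpret_cast<void *>(word_); }
};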

> I was under impression that it used plain virtual memory to talk to
> hardware (thus getting speed).
> 
> > I'd have thought the storage between Cold and ColdStore was
> > comparable.

I meant the amount of storage used by Genesis and by ColdStore would be 
approximately comparable, unless you've changed the marshalling code from the 
original coldmud (and it was pretty optimal).

> The last version of Cold storage is my work, and one I'm not
> particularly proud of (okay, this is an understatement. In actuality,
> it's brown-paper-bag quality code - I want to wear one over my head
> when I think about certain things in there). :)
> 
> > ColdStore doesn't set out to make C++ programming safe, because
> > that's impossible.  It sets out to provide a toolkit for supporting
> > higher-level interpreted languages in which one would write
> > applications.
> 
> The website suggested it was C++ persistent store system, I was
> replying under that assumption.

It is a C++ persistent store, with all the baggage that C++ brings.

> > It's certainly necessary to thoroughly engineer the C++ classes you
> > use and to avoid world-stopping bugs, however Miro's argument could
> > apply to any complex piece of code upon which your world depends:
> > the o/s, the db backend.
> 
> 'Don't write buggy code' is not something I'd recommend to a
> programmer and expect gratitude. :)

Again, if there's a bug somewhere in (say) the MOO server, or in the Genesis 
server, or in the FreeBSD or Linux or Solaris o/s, it's going to have an 
adverse impact on the service being run ... ColdStore's C++ objects are at 
that level, below the application level.

> > One could certainly use the toconstructor() facility for changing
> > the raw layout of objects.
> > 
> > This would be painful, however, and one would be well advised to
> > think through the raw layout of lowest level objects before
> > committing them to persistent store.
> 
> Assuming that ColdStore is not meant to store C++ objects, but only be
> used with an interpreter to keep it in check, this makes sense.

ColdStore can be used at layer0, to store C++ objects, or at layer1, to 
mediate between the storage of C++ objects and the provision of those objects 
to an interpreted language.  It's really up to the application developer.  I'd 
probably recommend either using an interpreted language, or using Java or 
something to generate the native code that manipulates the objects stored in 
C++.

> > > - db format is highly non-portable. It's not just architecture dependent,
> > >   it's C++ compiler dependent (as object layout may change).
> > 
> > Certainly.  It was a design decision to localise the cost of moving
> > between architectures to the time when one might wish to actually
> > move them, rather than seek to support an abstracted/portable format
> > for which one pays on each access.
> > 
> > Many database formats are not immediately binary-portable; usually the
> > databases have to be serialised or marshalled before moving them.
> 
> Does ColdStore support this, or is it supposed to be provided by an
> application sitting on top of it?

I've followed this convention in all the layer1 objects I've written: the 
toconstructor() method is present in the virtual protocol supported by layer1.  
If people want to write objects which they don't want to serialise, that's 
their lookout.  I probably wouldn't choose to use them, though.
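
I don't want to pretend this is the real signature, but the flavour of it is 
roughly as follows (a sketch only; the name is the one mentioned above, the 
return type and details are my own guess at the layer1 protocol): each object 
emits the text of a constructor call which, interpreted or recompiled 
elsewhere, rebuilds an equivalent object.

#include <sstream>
#include <string>

// Hypothetical sketch only -- the real toconstructor() in the layer1
// virtual protocol may differ.  The idea: an object serialises itself
// as the text of a constructor expression that would recreate it.
class String {
    std::string value_;
public:
    explicit String(const std::string &v) : value_(v) {}

    // Emits something like:  String("hello")
    std::string toconstructor() const {
        std::ostringstream out;
        out << "String(\"" << value_ << "\")";   // no escaping; sketch only
        return out.str();
    }
};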

Another alternative presenting itself more recently is the use of the openc++ 
preprocessor (which we use) to autogenerate serialising/deserialising code.  
It's possible, but it's a long way from the top of my priority queue.

> > I was in the process of writing it right out in favour of GC, when
> > someone pointed out to me that refcounting is a great storage
> > optimiser - ColdStore collections and vector types share all
> > substructure, and implement Copy On Write semantics on low level
> > objects.
>
> This is acceptable if you already decided to pay the penalty for
> having an interpreter in the first place.

Sure.  Look, I personally would write C++ directly to the store, because I'm 
really comfortable with it.  I like it.  I wrote a lot of it ...  On the other 
hand, I really think you could do worse than using some interpreted language 
to implement the functionality you want, working out what needs to be written 
more efficiently, and then sinking it lower down, in C++.  <-- The usual 
scripting language justification.
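
The copy-on-write point above deserves a small illustration (my own sketch, 
nothing like ColdStore's actual vector code): while the refcount says the 
buffer is shared, a writer copies it first; a sole owner mutates in place.

#include <string>

// Minimal copy-on-write sketch over a refcounted buffer; ColdStore's
// collection types are far more elaborate, this just shows the idea.
class CowString {
    struct Rep {
        int         refs;
        std::string data;
        Rep(const std::string &s) : refs(1), data(s) {}
    };
    Rep *rep_;

    void detach() {                       // copy before the first write
        if (rep_->refs > 1) {
            Rep *fresh = new Rep(rep_->data);
            release();
            rep_ = fresh;
        }
    }
    void release() {
        if (--rep_->refs == 0) delete rep_;
    }
public:
    explicit CowString(const std::string &s) : rep_(new Rep(s)) {}
    CowString(const CowString &o) : rep_(o.rep_) { ++rep_->refs; }
    ~CowString() { release(); }
    // (assignment operator omitted for brevity)

    const std::string &read() const { return rep_->data; }
    void append(const std::string &s) { detach(); rep_->data += s; }
};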

> > I don't see the problem with using inline assembly for atomic
> > increments, that's what inline assembly is for.  It's what we use,
> > nicely wrapped in a Counter class.
> 
> *.......* (insert any number of sounds here)

(insert colorful finger gestures here)
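
For the curious, the wrapping really is about this small (a sketch in GCC x86 
inline assembly syntax, not the actual Counter source):

// Sketch of an atomically adjusted counter using GCC x86 inline
// assembly, in the spirit of the Counter class mentioned above.
class Counter {
    volatile int count_;
public:
    Counter() : count_(0) {}

    void increment() {
        __asm__ __volatile__("lock; incl %0"
                             : "+m"(count_)
                             :
                             : "memory", "cc");
    }
    void decrement() {
        __asm__ __volatile__("lock; decl %0"
                             : "+m"(count_)
                             :
                             : "memory", "cc");
    }
    int value() const { return count_; }
};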

> > Maximising data locality by sharing substructure will always tend to
> > increase contention between threads at the same time as it tends to
> > decrease page faults.  This is a classic trade-off: throughput
> > vs. responsiveness.  Are networks really fast enough, yet, that
> > response-lag is an issue?
> > 
> > As long as the time taken to swap a page from disk is much longer
> > than a thread context switch, I'll tend to worry more about locality
> > than thread performance, most particularly since most of large MUD
> > architectures don't even have real multitasking.
> 
> Okay, I think this is the central issue.
> 
> Yes, textual MUDs are bandwidth-bound. However, this list is not only
> about textual MUDs. Also, I think that bandwidth-bound MUDs are not
> interesting any more (because even MOO, once memory became cheap,
> CPUs became fast, and the codebase became slightly more optimised than it
> used to be, is now bandwidth-bound - so why bother inventing a better
> mousetrap?)

I didn't really mean throughput as in network bandwidth, but more disk swap 
bandwidth.

ColdStore arose from an observation that coldmud was reputed to be faster than 
MOO even though coldmud had to go through serialisation/deserialisation 
whenever objects were swapped between disk and memory.  We concluded that the 
disparity was due to an accidental feature of coldmud: it would unpack objects 
into a minimal set of pages in the process address space, while MOO (seemingly 
by design) scatters its objects across the whole address space.

The analysis led us to suppose that maximising spatial locality of reference 
was a highly desirable performance enhancement.

I guess I'm contending that it doesn't matter how little contention there is 
for shared resources, if those resources are scattered all over the disk in 
little pieces which have to be reassembled while the threads wait for them to 
be swapped back in.

> Now, if you have a predictive client (one that keeps some state, and
> knows how to evolve and display it), you can just send diffs from the
> server, saving LOTS of bandwidth. In that case, the interesting things
> become possible, but the server bottleneck shifts from bandwidth to
> processing. Refer to this list's archive for examples.

> Notice about paging - you have to write to the page each time you
> increase the refcount, and refcount may get increased just by passing
> pointers around.

Writing to a page doesn't necessarily entail writing it back to disk ASAP.  If 
it does happen to be so in existing kernels, then I guess I'll have to modify 
the kernel.

> This is different from GC schemes, where you only
> touch the pages you really want to write to (the tradeoff being the
> need for a write barrier, with any decent GC). Then there is the robustness
> of GC compared to refcounting (as reported by people involved with the
> Mozilla project). Refcounting certainly makes sense if the amount of
> code in the native language (C++, in this case) is small - but I don't
> like assuming that for a low level system I'd like to see in wide use.

Yes.  We're trying to minimise low level code.

> > Additionally, the authors of Texas don't answer email, the build
> > process entails heavy use of a hacked gdb (using STABS to generate
> > object interpretation code for swizzling) and Texas has no support
> > for dynamically loading new objects into a store.
> 
> I actually meant to say that Texas is a cool concept whose
> implementation desperately needs a complete rewrite, preferably by a
> team of programmers (rather than a team of computer scientists). :)

Computer scientists, having proven the concept, move on and leave us poor 
proggers to pick up the pieces.

Anyway, it occurred to me after discussing your comments with others on the 
ColdStore team (skeptopotamus, specifically) that it'd be entirely possible to 
use the Texas swizzling concept to provide a much larger second-level store, 
with the unswizzled objects relocated to reside in ColdStore (as a kind of 
cache).

What it'd buy you is a much larger address space - Texas object ids are 
32 bits, but can be mapped onto a much larger store.  You'd still take the 
performance hit of the segv overhead and of unswizzling, but it'd be an 
interesting hybrid.
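
To make the segv overhead concrete (a bare-bones sketch of the trap mechanism, 
not anything resembling Texas's actual code): you map the region unreadable, 
catch the first touch in a SIGSEGV handler, translate the page there, and let 
the faulting access retry.

#include <signal.h>
#include <stdio.h>
#include <stdint.h>
#include <sys/mman.h>

// Bare-bones sketch of trap-on-first-touch, the mechanism behind the
// segv overhead mentioned above.  A real swizzler would translate
// 32-bit object ids into store addresses here; this just opens the page.
static const size_t kPage = 4096;

static void on_fault(int, siginfo_t *info, void *)
{
    uintptr_t page = (uintptr_t) info->si_addr & ~(uintptr_t) (kPage - 1);
    // Unswizzling of the page's contents would happen at this point.
    mprotect((void *) page, kPage, PROT_READ | PROT_WRITE);
}

int main()
{
    struct sigaction sa;
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, NULL);

    char *region = (char *) mmap(NULL, 16 * kPage, PROT_NONE,
                                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    region[0] = 'x';            // faults, handler opens the page, access retries
    printf("%c\n", region[0]);  // prints 'x'
    return 0;
}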

Perhaps you could do some kind of magical DB backing store and use ColdStore 
as a general cache ... I don't know, I'll jump that (1.75 GB) barrier when I 
come to it.

> To conclude, within the assumptions you decided to work with,
> ColdStore seems cool (modulo my disagreement with use of inline
> assembly). But I tend to work under VERY different assumptions. :)

I'm all for mosaic solutions.  May your assumptions lead you to interesting new problems.

Colin.





_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
http://www.kanga.nu/lists/listinfo/mud-dev


