[MUD-Dev] Fw: lurker emerges

James Wilson jwilson at rochester.rr.com
Sun Aug 9 11:15:14 CEST 1998


-----Original Message-----
From: James Wilson <jwilson at rochester.rr.com>
To: mud-dev at kanga.nu <mud-dev at kanga.nu>
Date: Sunday, August 09, 1998 2:06 AM
Subject: lurker emerges


[snip]


>As I see it, not having gone the whole nine and tested it out, using a
>single, select()ing thread to do all your stuff would work fine if each
>operation is bounded by some small amount of time. That is, if you spend
>too much time in processing the received action, your responsiveness to
>pending requests goes down. I'm not sure how or if this issue is solved
>in the select()-based http servers, and am looking at the source code to
>try to suss it out. How do they deal with a request for a bigass file?
>Do all the ripe sockets wait to be select()ed while the bigass file is
>sent on its merry way? News at 11.


Update: thttpd uses select() for multiplexing its output as well as its
input. Thus the bigass file's transmission is interleaved with that of
every other file being sent, rather than holding up the other sockets.
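
For concreteness, here is a minimal Python sketch of that kind of output
multiplexing. It is not thttpd's code: the function name serve_interleaved
and the chunk parameter are made up for illustration, and a real server
would also select() on its inputs in the same loop.

```python
import select
import socket


def serve_interleaved(pending, chunk=4096):
    """Send each socket's remaining bytes a chunk at a time.

    pending maps a connected socket to the bytes still to be sent on it.
    Returns the order in which sockets were serviced, for illustration.
    """
    order = []
    for sock in pending:
        sock.setblocking(False)
    while pending:
        # Ask which sockets can accept more output right now.
        _, writable, _ = select.select([], list(pending), [])
        for sock in writable:
            data = pending[sock]
            try:
                sent = sock.send(data[:chunk])
            except BlockingIOError:
                continue  # buffer filled up after all; retry next pass
            order.append(sock)
            if sent == len(data):
                del pending[sock]          # this transfer is finished
            else:
                pending[sock] = data[sent:]
    return order
```

Because each pass writes at most one chunk per ready socket, a short
response sharing the loop with a long one finishes after a couple of
passes instead of waiting for the long one to complete.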

>The simplest way of serializing mud processes would be to lock the database
>and heap of in-memory mud objects for every transaction. It seems to me
>this would reduce the system to one essentially identical to the
>single-threaded select() server. The lockless system described in the FAQ
>could be an improvement on this. In this system, access to the database is
>atomic, while in-memory objects are thread-local and competing
>modifications get resolved with a repeat of the modifying event. This
>pushes the serialization to different places, namely the point at which the
>database is read to generate the local copies and the point at which the
>database is written and checked for discrepancies. This is still an
>improvement because the lockless system allows concurrent processing once
>objects have been read in from the database. (I am still a little fuzzy on
>some details - how is the collection of objects to 'clone' determined? Does
>the cloning thread save two clones of the object, one 'as it was read in',
>and one 'production copy'? If not, how does it know the difference between
>the object's state-at-snapshot and the object's current database state
>(which might reflect modifications by other threads)? If a thread grabs an
>object and runs for a long time, will it see modifications to that object,
>or work with the old copy?)
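
A rough Python sketch of the read/modify/commit cycle described in the
quoted paragraph, under my own assumptions rather than anything taken from
the FAQ: VersionedStore, run_event and the per-object version counter are
hypothetical names, and the internal mutex only stands in for "access to
the database is atomic". The event itself runs with no lock held.

```python
import copy
import threading


class VersionedStore:
    """Toy stand-in for the database: only get/commit are atomic."""

    def __init__(self):
        self._guard = threading.Lock()  # models atomic DB access only
        self._data = {}                 # key -> (version, value)

    def put_initial(self, key, value):
        with self._guard:
            self._data[key] = (0, copy.deepcopy(value))

    def get(self, key):
        """Return (version, private copy) of the stored object."""
        with self._guard:
            version, value = self._data[key]
            return version, copy.deepcopy(value)

    def commit(self, key, seen_version, new_value):
        """Write only if nobody committed since seen_version was read."""
        with self._guard:
            current, _ = self._data[key]
            if current != seen_version:
                return False            # discrepancy: caller must redo
            self._data[key] = (current + 1, copy.deepcopy(new_value))
            return True


def run_event(store, key, event):
    """Apply event to a thread-local copy; repeat it after a collision."""
    attempts = 0
    while True:
        attempts += 1
        version, obj = store.get(key)
        event(obj)                      # runs without holding any lock
        if store.commit(key, version, obj):
            return attempts
```

In this sketch a thread that runs for a long time keeps working with its
old copy and only finds out about other threads' changes when its commit
is refused.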


I retract these questions after rereading the FAQ. Instead, let me ask:

1. If one obtains an object which contains a reference to another, and
then obtains the referent, does this require two DB hits? If so, perhaps a
prefetching strategy could be used: requesting object foo gets you foo
together with everything it points to directly (but nothing beyond that).
This could be a net loss if the larger set of fetched objects incurs enough
extra collisions with other threads to swamp any gains from prefetching.
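
To make the prefetching idea in question 1 concrete, here is a small
Python sketch. Everything in it is hypothetical (the CountingDB class, the
"refs" field, the hit counter); it only illustrates fetching an object
together with its direct referents in one round trip.

```python
class CountingDB:
    """Toy object store that counts round trips (hypothetical API)."""

    def __init__(self, objects):
        self._objects = objects   # name -> {"refs": [names], ...}
        self.hits = 0

    def fetch(self, name):
        """One round trip for one object (returned as a shallow copy)."""
        self.hits += 1
        return dict(self._objects[name])

    def fetch_with_refs(self, name):
        """One round trip: the object plus everything it points to directly."""
        self.hits += 1
        root = dict(self._objects[name])
        found = {name: root}
        for ref in root.get("refs", []):
            # One level only: the referents' own refs are not followed.
            found[ref] = dict(self._objects[ref])
        return found
```

The catch raised above shows up in the size of the returned set: every
extra object a thread holds is one more object whose modification by
another thread could force the event to be repeated.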

2. How can one use this system to guarantee correctness when dealing with
compound objects? For example, suppose that A points to B and B points to
A, and further you would like to ensure that if at any time A no longer
points to B, it will also be true that B no longer points to A, and vice
versa. Imagine a parent-child relationship, where 'A child-of B' iff
'B parent-of A'.

Now suppose thread T1 obtains a reference to A with the understanding that
this invariant holds. Simultaneously, T2 obtains B and changes it so it no
longer points to A. If T2 commits B to the database before committing A,
the invariant is broken - T1 has a copy of A which points to B, which does
not point to A. This would be a race condition (or a "dragon's door", as it
has been termed here).

If on the other hand T2 grabs A as well, breaks the link to B, and commits
A before B, again the database will violate the invariant; while T1 in this
example would be safe because its cache would be invalidated, a third
thread T3 holding a clone of B would be endangered, in the mirror image of
T1's position with A in the first case.

Finally, there could be a provision for multiple simultaneous commits,
which would correctly invalidate the caches of T1 and T3 and ensure the
large-scale correctness of the data structure. If the structure is large,
however, it would be quite possible that an event which needed to access
every component and commit them all en masse could be constantly aborted by
commits to the components from other threads. That is, suppose that in the
case above, T2 reads in A and B and modifies them before attempting a
simultaneous commit. Meanwhile T1 and T3 are regularly modifying A and B,
such that T2 is always interrupted before it can finish its work with both
and commit them. Granted, this is highly unlikely if you are talking about
only two objects, but suppose you were dealing with a large data structure
whose components are regularly being modified by other threads? How could
you be sure you would eventually get to commit your changes?
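
Here is a Python sketch of the "multiple simultaneous commits" provision
from question 2, again under assumptions of my own: MultiStore, unlink and
max_attempts are hypothetical, and the mutex only models the database
applying one commit at a time. The commit checks the version of every
object in the snapshot and writes all changes or none.

```python
import threading


class MultiStore:
    """Toy store whose commit validates and writes several objects at once."""

    def __init__(self, initial):
        self._guard = threading.Lock()
        self._data = {k: (0, dict(v)) for k, v in initial.items()}

    def read(self, keys):
        """Atomically snapshot several objects: key -> (version, copy)."""
        with self._guard:
            return {k: (self._data[k][0], dict(self._data[k][1])) for k in keys}

    def commit(self, snapshot, changes):
        """All-or-nothing: fail if any object in the snapshot has moved on."""
        with self._guard:
            for key, (version, _) in snapshot.items():
                if self._data[key][0] != version:
                    return False
            for key, value in changes.items():
                self._data[key] = (self._data[key][0] + 1, dict(value))
            return True


def unlink(store, child, parent, max_attempts=5):
    """Break both halves of a parent/child link in a single commit."""
    for attempt in range(1, max_attempts + 1):
        snap = store.read([child, parent])
        a = snap[child][1]
        b = snap[parent][1]
        a["parent"] = None
        b["child"] = None
        if store.commit(snap, {child: a, parent: b}):
            return attempt
    return None  # gave up: other writers kept invalidating the snapshot
```

With this shape the database never holds a half-broken link, and a thread
still holding the old A has its next commit refused. It does nothing about
the starvation question: the None return only makes visible the case where
other threads' commits keep aborting the compound one.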

James






