[MUD-Dev] Re: lockless system - foolproof?

J C Lawrence claw at kanga.nu
Sun Aug 30 13:20:37 CEST 1998


On Sat, 29 Aug 1998 19:42:17 -0400 
James Wilson <jwilson at rochester.rr.com> wrote:

> In order to avoid the problems inherent in thread synchronization,
> (I think) JC has suggested a lockless system based on (what I call)
> per-object "cache invalidation", that is, writes to objects a thread
> is examining invalidate that thread's cache and hook a restart of
> the event being processed.  This is fine for single-object locking,
> but (as far as I understand it) does not address the case where an
> event must deal with multiple objects atomically without those
> objects mutating under it. 

No, you are missing a key point:  The only time that an "invalidation"
(as you term it) occurs is when an event successfully C&C's.  That is,
for any event to be rescheduled there must have been a prior event
which successfully completed (this ignores the minor question of
internal event failures and time-outs).

There is ___NO___ cross-effect between the object caches of competing
events until an event successfully C&C's.  This is extremely
important, and in fact a fundamental definition of the C&C model.
NOTHING exists for access by anything else until an event successfully
C&C's.  To do otherwise would actively invite logical inconsistencies.
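
A minimal sketch of that rule in C++ (ObjectId, Version,
commit_and_check and everything below are my own illustrative names,
not from any particular server): the committed store is the only state
events share, an event's working set is entirely private, and the
check can only fail because some other event's successful commit has
already bumped a version.

  #include <map>
  #include <set>

  using ObjectId = int;
  using Version  = unsigned long;

  // The committed store is the ONLY state visible across events.
  // Versions here are bumped exclusively by successful C&Cs.
  std::map<ObjectId, Version> committed_version;

  struct Event {
      // Entirely private: no other event can see this working set,
      // and this event cannot see any other event's working set.
      std::map<ObjectId, Version> seen;     // version at first touch
      std::set<ObjectId>          written;
  };

  // C&C: fails only if some object this event touched now carries a
  // newer committed version, which can only happen because another
  // event has already C&C'ed successfully.  Check and publish are a
  // single atomic step in a real server.
  bool commit_and_check(Event& ev) {
      for (const auto& [id, ver] : ev.seen)
          if (committed_version[id] != ver)
              return false;                 // discard copies, reschedule
      for (ObjectId id : ev.written)
          ++committed_version[id];          // publish the new states
      return true;
  }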

> In this case, the compound object would
> need to be locked, and all its components locked, in order to
> prevent concurrent accesses from constantly (and perhaps
> indefinitely) restarting the event that needs the whole object at
> once. 

Nope.  There is no locking.

> Please correct me if I'm wrong...

Welcome.

> The simple case is this: we have two objects, A and B, containing
> reference fields. The programmer would like to guarantee that, if A
> points to B, B will point back to A, and vice versa; that is, there
> will always be a cycle between them. A and B could be nodes in a
> doubly linked list, for instance, where
>
>    node->previous->next == node || 0 == node->previous
>
> and
>
>    node-> next->previous == node || 0 == node->next
>
> should always be true. If thread C grabs A and breaks its link to B,
> and if thread D grabs B before C gets a chance to break its link to
> A, how can we ensure that the constraint is not broken? 

Each thread operates on local copies of the objects.  The semantic
construct is copy-on-write.

To diagram your example:

  Event C starts.
  Event D starts.
  Event C accesses Object A.
    Event C gets a local reference pointer to A in the DB cache.
  Event D accesses Object B.
    Event D gets a local reference pointer to B in the DB cache.
  Event C modifies A.
    The local reference pointer to A is destroyed and a new, local
        copy of A is made which is private to C.
  Event D modifies B.
    The local reference pointer to B is destroyed and a new, local
        copy of B is made which is private to D.
  Event C accesses B.
    Event C gets a local reference pointer to B in the DB cache.
  Event D accesses A.
    Event D gets a local reference pointer to A in the DB cache.
  Event C modifies B.
    The local reference pointer to B is destroyed and a new, local
        copy of B is made which is private to C.
  Event D modifies A.
    The local reference pointer to A is destroyed and a new, local
        copy of A is made which is private to D.
  Event C attempts to C&C.
    Event C has objects A and B in its local private modified list.
    Neither A nor B has been modified in the external DB since C
        started execution.
    C's changes to A and B are committed to the DB.
    All of this is done atomically.
    This is a successful C&C.
  Event D attempts to C&C.
    Event D has objects A and B in its local private modified list.
    Both A and B have been modified in the external DB since D
        started execution (the recent changes made by C).
    D fails C&C.
    The local copies of A and B are discarded and D is rescheduled.
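
The same trace as a rough C++ sketch (the Obj and EventCache types,
the char ids and the integer payloads are all invented for
illustration): access() hands back a reference into the DB cache, the
first modify() takes a private copy, and cc() validates and publishes
atomically.

  #include <cstdio>
  #include <map>

  struct Obj { int value; unsigned long version; };

  // Shared DB cache of committed objects.
  std::map<char, Obj> db = { {'A', {0, 1}}, {'B', {0, 1}} };

  struct EventCache {
      std::map<char, unsigned long> seen;   // version at first access
      std::map<char, Obj>           local;  // private copy-on-write copies

      // Access: a reference pointer into the DB cache (or our copy).
      const Obj& access(char id) {
          seen.emplace(id, db[id].version);
          return local.count(id) ? local[id] : db[id];
      }

      // Modify: the first write replaces the reference with a private copy.
      void modify(char id, int v) {
          seen.emplace(id, db[id].version);
          if (!local.count(id)) local[id] = db[id];   // copy-on-write
          local[id].value = v;
      }

      // C&C: commit if and only if nothing we saw has been re-committed.
      bool cc() {
          for (auto& [id, ver] : seen)
              if (db[id].version != ver) { local.clear(); return false; }
          for (auto& [id, obj] : local) { db[id] = obj; ++db[id].version; }
          return true;
      }
  };

  int main() {
      EventCache C, D;
      C.access('A');     D.access('B');      // reference pointers only
      C.modify('A', 1);  D.modify('B', 2);   // private copies, no interference
      C.access('B');     D.access('A');
      C.modify('B', 3);  D.modify('A', 4);
      std::printf("C C&C: %s\n", C.cc() ? "ok" : "fail");   // ok: commits A and B
      std::printf("D C&C: %s\n", D.cc() ? "ok" : "fail");   // fail: D must be rescheduled
  }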

An optimisation of this last step (D's failed C&C and reschedule) has
the successful C&C of C note which objects have had new states
committed, and on that basis which executing events have been given
references to those objects.  The C&C mechanism then destroys and
reschedules all those events, as they are now guaranteed to fail their
upcoming C&C attempts.
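
A sketch of that optimisation (RunningEvent, invalidate_losers and
friends are my names, illustrative only): as the final step of a
successful C&C, any event still executing that holds a reference to a
just-committed object is marked for destruction and rescheduling
immediately, rather than being left to discover the conflict at its
own C&C.

  #include <set>
  #include <vector>

  using ObjectId = int;

  struct RunningEvent {
      std::set<ObjectId> referenced;   // objects this event has touched
      bool doomed = false;             // destroy and reschedule when set
  };

  // Final step of a successful C&C: 'just_committed' is the set of
  // objects whose new states were just published.
  void invalidate_losers(const std::set<ObjectId>& just_committed,
                         std::vector<RunningEvent*>& executing) {
      for (RunningEvent* ev : executing)
          for (ObjectId id : ev->referenced)
              if (just_committed.count(id)) {
                  // This event is now guaranteed to fail its own C&C,
                  // so don't let it waste the cycles getting there.
                  ev->doomed = true;
                  break;
              }
  }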
  
> one solution could be: C doesn't save its change to A to the
> database until it has also changed B. When it does save changes, it
> will then need an atomic operation to save A and B simultaneously.

Bingo.

> Assuming this solution (please let me know if there are any other
> solutions), the possibility arises of having to collect a large
> number of such database commits due to some operation on a complex
> compound object, where guaranteeing the atomicity of access to
> subobjects is not sufficient. Then a thread which tries to grab all
> the objects in question, operate upon them in such a way that the
> overall constraints are satisfied, and save the changes, would be
> vulnerable to frequent, possibly endless restarts as one or another
> of the many objects it is trying to keep coherent is changed.

True.  

The pessimal case:

  Event X modifies the very large set of objects A thru Z.

  Event Y modifies the single object Q, which is a member of the set
modified by X.

  Event Y is rapidly respawned, each time modifying Q.

  Event X attempts to run but is constantly forestalled by the fact
that one or more iterations of Y have successfully C&C'ed during X's
execution, thus preventing X from successfully C&C'ing.

Left alone X will never successfully C&C.

The easiest, and perhaps the simplest, way of attacking this is to
adapt the level of parallelism in your execution model.  Again, taking
the pessimal case: if the execution model degrades until only one
event is executing at a time, then X is guaranteed to successfully C&C
when its turn comes, as there is no competition.

This is the approach I take.  Events enter the Executor with a
"priority" value.  As events fail C&C they accumulate priority.  The
Executor's execution model then adapts dynamically to the highest
priority event on its list.  At one end it is the normal unrestrained
competition between events, and at the other it degrades to
single-tasking, with several stages in between.
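
A sketch of that throttle (the thresholds and the name max_concurrent
are invented for illustration): every failed C&C bumps the event's
priority, and the Executor shrinks the number of events allowed to run
in parallel as the highest waiting priority grows, bottoming out at
strict single-tasking where the starved event cannot lose.

  #include <algorithm>

  struct QueuedEvent {
      int priority = 0;     // bumped every time this event fails C&C
  };

  // Concurrency the Executor permits, given the highest priority on
  // its queue: normal unrestrained competition at one end,
  // single-tasking at the other, with several stages in between.
  int max_concurrent(int highest_priority, int normal_limit) {
      if (highest_priority >= 8) return 1;                        // single-tasking
      if (highest_priority >= 4) return std::max(1, normal_limit / 4);
      if (highest_priority >= 2) return std::max(1, normal_limit / 2);
      return normal_limit;                                        // unrestrained
  }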

> Someone suggested that after several cache invalidations the thread
> might try a different tack, explicitly locking the structure it is
> manipulating and all the objects underneath it. This brings us right
> back to what we wanted to avoid, locking and the concomitant
> possibility of deadlock.

Quite.  It also invalidates the entire model.

--
J C Lawrence                               Internet: claw at null.net
----------(*)                              Internet: coder at ibm.net
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...



