[MUD-Dev] Re: lockless system - foolproof?

J C Lawrence claw at kanga.nu
Sun Aug 30 22:49:57 CEST 1998


On Sun, 30 Aug 1998 20:36:11 -0400 
James Wilson <jwilson at rochester.rr.com> wrote:
> On Sun, 30 Aug 1998, J C Lawrence wrote:
>> On Sat, 29 Aug 1998 19:42:17 -0400 James
>> Wilson <jwilson at rochester.rr.com> wrote:

>> The easiest and perhaps the simplest way of attacking this is thru
>> approaching the level of parallelism in your execution model.
>> Again, taking the pessimal case, if the execution model degrades
>> until only one event is executing at a time, then X is guaranteed
>> to successfully C&C when its turn comes as there is no competition.
>> 
>> This is the approach I take.  Events enter the Executor with a
>> "priority" value.  As events fail C&C they accumulate priority.
>> The executor's execution model then adapts dynamically per the
>> highest priority event on its list.  At one end it's the normal
>> unrestrained competition between events, and at the other it
>> degrades to single-tasking, with several stages in between.

> Are your events guaranteed to complete in bounded time? 

No.  I attempt (and fail) to guarantee correctness, not response time.

I do place a maximum cap on the execution time of non-privileged
events, however.  Should they require more execution time they fail by
exception.
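
For illustration, a minimal sketch of that cap in C++ (the names and
structure are invented for the example, not taken from the server):
the byte-code interpreter checks a per-event time budget at safe
points, and an event that runs past its budget fails by exception.

  // Illustrative only: capping the execution time of a non-privileged event.
  #include <chrono>
  #include <iostream>
  #include <stdexcept>

  struct TimeBudgetExceeded : std::runtime_error {
      TimeBudgetExceeded() : std::runtime_error("event exceeded its time budget") {}
  };

  class EventBudget {
      std::chrono::steady_clock::time_point deadline_;
  public:
      explicit EventBudget(std::chrono::milliseconds cap)
          : deadline_(std::chrono::steady_clock::now() + cap) {}
      // The interpreter calls this at safe points (eg once per execution
      // quantum); exceeding the budget aborts the event.
      void check() const {
          if (std::chrono::steady_clock::now() > deadline_)
              throw TimeBudgetExceeded();
      }
  };

  int main() {
      EventBudget budget(std::chrono::milliseconds(10));
      try {
          for (;;)            // a runaway, non-privileged event
              budget.check(); // checked at every quantum
      } catch (const TimeBudgetExceeded& e) {
          std::cout << "event aborted: " << e.what() << "\n";
      }
  }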

> You mentioned that you are designing with user scripting in mind. If
> one of those contentious events ends up getting a monopoly on cpu
> time, *and* it's of long/indefinite duration (which would seem to
> correlate nicely with the propensity for contentiousness),
> everything would freeze waiting for it.

Pessimally, nearly yes.  In practice, no.  The game-wide event
thru-put will (fairly rapidly) degrade to single-tasking, and then
resume immediately with normal processing.  Should the offensive event
be a rapid respawner (ie there are many such events and/or each one
creates further events like itself), which is the ultimate pessimal
case, the game will enter a yo-yo-like state of constantly entering
single-tasking mode and then leaving it.
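
To make the adaptive model quoted above a little more concrete, here
is a rough C++ sketch (names invented for the example, not my actual
code): each failed C&C bumps the event's priority, and the executor's
allowed parallelism shrinks toward single-tasking as the highest
outstanding priority climbs.

  // Illustrative only: parallelism adapts to the most-starved event.
  #include <algorithm>
  #include <cstdio>
  #include <vector>

  struct Event {
      int id;
      int priority = 0;   // bumped on every failed C&C
  };

  class Executor {
      std::vector<Event> pending_;
      int max_concurrency_;
  public:
      explicit Executor(int max_concurrency) : max_concurrency_(max_concurrency) {}
      void submit(Event e) { pending_.push_back(e); }
      bool idle() const { return pending_.empty(); }

      // Allowed parallelism shrinks as the most-starved event's priority
      // grows, degrading to pure single-tasking in the worst case.
      int allowedParallelism() const {
          int highest = 0;
          for (const Event& e : pending_) highest = std::max(highest, e.priority);
          return std::max(1, max_concurrency_ - highest);
      }

      // One scheduling pass: events past the parallelism limit wait, and
      // events whose commit fails accumulate priority for the next pass.
      void runOnce(bool (*tryCommit)(const Event&)) {
          int slots = allowedParallelism();
          std::vector<Event> retry;
          for (Event& e : pending_) {
              if (slots-- <= 0) { retry.push_back(e); continue; }
              if (!tryCommit(e)) {
                  ++e.priority;
                  retry.push_back(e);
              }
          }
          pending_.swap(retry);
      }
  };

  int main() {
      Executor ex(8);
      for (int i = 0; i < 5; ++i) ex.submit(Event{i});
      // Simulated C&C: event 0 keeps colliding until the field narrows.
      auto attempt = +[](const Event& e) { return e.id != 0 || e.priority >= 3; };
      while (!ex.idle()) {
          std::printf("parallelism this pass: %d\n", ex.allowedParallelism());
          ex.runOnce(attempt);
      }
  }

The point is only the shape: a starved event eventually finds itself
with little or no competition, at which point its C&C cannot fail.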

However such excessively contentious events are easy to detect and
guard against.  Here, I (used to) run a little statistics monitor that
would note any events that took the game to or near single-tasking
mode, and, if they repeated more than once an hour, would spoof the
defining object and freeze the event calls for admin intervention.
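
The monitor itself was nothing clever; something of roughly this shape
would do (again, illustrative C++ with invented names): record which
defining object dragged the game to single-tasking, and flag any
object that does it more than once within an hour.

  // Illustrative only: flagging repeat offenders for admin intervention.
  #include <chrono>
  #include <cstdio>
  #include <deque>
  #include <map>
  #include <string>

  class DegradationMonitor {
      using Clock = std::chrono::steady_clock;
      std::map<std::string, std::deque<Clock::time_point>> incidents_;
  public:
      // Returns true when the defining object should be frozen for admin
      // attention (more than one incident within the last hour).
      bool noteSingleTaskingIncident(const std::string& definingObject) {
          auto now = Clock::now();
          auto& hist = incidents_[definingObject];
          hist.push_back(now);
          while (!hist.empty() && now - hist.front() > std::chrono::hours(1))
              hist.pop_front();
          return hist.size() > 1;
      }
  };

  int main() {
      DegradationMonitor mon;
      mon.noteSingleTaskingIncident("object#1234");       // first offence: noted
      bool freeze = mon.noteSingleTaskingIncident("object#1234");  // second inside the hour
      std::printf("freeze object#1234: %s\n", freeze ? "yes" : "no");
  }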

  Aside: An optimisation and possible feature I've considered adding
is having the progression from single-tasking back to normal execution
be potentially graded on the basis of the past frequency of
single-tasking degradations.  If the system has been decaying to
single-tasking a lot lately, then it would take its time returning to
maximal parallelisation.
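
One possible shape for that graded return (illustrative C++, invented
names): the step by which the concurrency ceiling rises each pass
shrinks as recent degradations accumulate, so a system that has been
thrashing crawls back to full parallelism instead of leaping.

  // Illustrative only: ramping parallelism back up after single-tasking.
  #include <algorithm>
  #include <cstdio>

  // One scheduling pass: raise the ceiling by a step that shrinks as the
  // count of recent degradations grows.
  int nextCeiling(int current, int maximum, int recentDegradations) {
      int step = std::max(1, maximum / (1 + recentDegradations));
      return std::min(maximum, current + step);
  }

  int main() {
      int histories[] = {0, 4, 16};              // recent degradation counts
      for (int recent : histories) {
          int ceiling = 1;                       // just left single-tasking
          std::printf("recent degradations = %2d:", recent);
          while (ceiling < 32) {
              ceiling = nextCeiling(ceiling, 32, recent);
              std::printf(" %d", ceiling);
          }
          std::printf("\n");
      }
  }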

I can make the system protect itself from idiots.  I can make the
system resilient to attacks by idiots.  I'm unwilling to make a system
which is proof against idiots.

> While some might say it seems a bad idea to allow users to write
> long-running functions, a scripting system would seem to me to be
> most useful to builders wanting to add new functionality to the
> world without having to muck about in the source code. 

True, sorta.  The caveat is on the definition of "source code".  I
don't have that distinction.  The only "programmer" source (ie C/C++)
is in the base server, which knows nothing about games, players, NPCs,
game worlds, etc.  It merely knows about objects, network connections,
and language byte-coding and execution, and is, when viewed from
inside the game, deliberately opaque and inviolate.  As such the
entire game world, from the lowest base principles on up, is written
in the "scripting language" per se.

That said, much like the old argument of C vs Pascal, free scripting
on MUDs is akin to handing chainguns with hair triggers to monkeys.

> In this case, putting bounds on running time would be a serious
> constraint on builder creativity. Ideally, bounding runtimes should
> be a matter of policy rather than forced by the implementation.

Disagreed, strongly.  The reason is actually simple: I find that
extremely long running or contentious events are actually of flawed
base design.  There is always another divisional structure that allows
the same result to be accomplished with fewer contention points, and
often with far greater elegance.

eg

  Assume a SHOUT command that passes its argument to all players and
NPCs.  The obvious (and actually fairly common) approach is something
like the following:

  for target in (players && NPCs)
    deliver string

This is an obvious high-contention candidate as the working set by the
end of the event will be extremely large.  A possible low-contention
implementation might be the following:

  1) Log an event X which does nothing but collect a list of all
players and NPCs (fairly trivial by traversing the child-list of the
relevant root objects, thus having a working set of only two objects)
and then, as part of its C&C, logs further events, either one per
target object or one per small sub-set of target objects (lists are
your friend).  This executes rapidly and has minimal contention.

  2) The secondary logged events then run and deliver the string to
the relevant players/NPCs, each with a total working set of only the
target object, and thus minimal contention.  These execute extremely
rapidly and have minimal contention.  (A rough sketch of this shape
follows below.)
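
In illustrative C++ (invented names, not my server's API), the shape
of (1) and (2) looks roughly like this: the parent event only walks
the target lists and logs one small delivery event per target, so no
single event ever accumulates a large working set.

  // Illustrative only: splitting SHOUT into many tiny delivery events.
  #include <functional>
  #include <iostream>
  #include <queue>
  #include <string>
  #include <vector>

  struct EventQueue {
      std::queue<std::function<void()>> events;
      void log(std::function<void()> ev) { events.push(std::move(ev)); }
      void run() {
          while (!events.empty()) {
              auto ev = std::move(events.front());
              events.pop();
              ev();   // each event commits against its own tiny working set
          }
      }
  };

  void shout(EventQueue& q, const std::vector<std::string>& targets,
             const std::string& text) {
      // Parent event: its working set is just the target lists.  As part
      // of its C&C it logs one delivery event per target.
      for (const auto& who : targets)
          q.log([who, text] {
              // Secondary event: working set is only the one target object.
              std::cout << who << " hears: " << text << "\n";
          });
  }

  int main() {
      EventQueue q;
      shout(q, {"player:alice", "player:bob", "npc:guard"}, "The gates are open!");
      q.run();
  }

Each delivery event touches exactly one target, so even a failed C&C
only re-runs one tiny event rather than the whole SHOUT.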

Of course this also means that your execution model had better be
pretty efficient or you will spend all your time in the overhead of
building and tearing down events rather than executing them.

This is the sort of thing which I am unwilling to proof against.  I'm
willing to make the system withstand such onslaughts (monolithic
long-running events), which it currently does, but I am not willing to
require the system to perform well under them (it doesn't).

Comparison:

  I have (I wrote) a tool which processes debug log files and is quite
fast and efficient, as well as powerfully useful in post-mortem
debugging.  As it is so fast I tend to use it heavily.  A critical
fact however is that this tool operates by loading its entire target
file into memory prior to processing -- which is one of the reasons it
is so fast, at least until your debug log file size starts approaching
or exceeding your total system RAM size.

  Will the tool still work when attempting to process an 8Gig debug
logfile on a system with only, say, 256Meg of RAM?  Absolutely.  Just
don't expect any useful data back from it any time soon (I did this
once, accidentally).  If you do wait long enough, however, it will
work, quite correctly, and generate the expected data.

  It is a design consideration.  Most of my log files don't approach
the system free RAM limit.  Ergo, the limit is "soft" and not of
concern.  Ditto for the behaviour of ungraceful events in my process
model.  They are capable of dragging the system to its knees, but they
are incapable of rendering the system either dead or non-progressive
(ie no events being processed).  I consider that an acceptable
trade-off.

--
J C Lawrence                               Internet: claw at null.net
----------(*)                              Internet: coder at ibm.net
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...



