[MUD-Dev] C&C and Event Rescheduling

Wed Jul 30 15:37:12 CEST 1997

> In <199707261742.TAA00453 at regoc.srce.hr>, on 07/26/97 
>    at 10:53 AM, silovic at srce.hr (Miroslav Silovic) said:

> commit as I do.  This could allow a failed event to die and reschedule
> much sooner than if it had to wait for the other contender to commit

This was my intention.

> Hurm.
> 
> It also changes the character fo the DB accesses.  In the lockless
> model events compete to C&C successfully.  Events with smaller
> workings have an advantage at C&C time, as do events with shorter run
> times.  Its a question of who is in and out first.  Neither of those
> points are true for your model.  For you, more or less, the first to
> reach the base has it, holds it, and is guaranteed to commit
> successfully (ignoring questions of needing to access an object which
> another already has locked, which is not a safe assumption as below). 

I consider this a good thing. Remember, you have change_parents function
in Cold. When you call it, it guarantees that the database will be
consistant after termination. The problem is that it means that
constructor/destructor pairs get called on /all/ descendants of the
affected object. Adding a parent to the user object, in other words,
will touch all the users. This means that such task has to run with VERY
high priority (and in my scheme, I intend to knock the locks off the
lower-priority tasks, precisely because of this). Now, this takes about
20-30 seconds on a LARGE db (say, 5,000 users), if the parent change is
significant. This means that if lower-priority tasks could kill this
at will, it'd take 5-10 minutes for it to grow high enough in priority
to actually complete, according to your scheme. In my scheme, it'd get
max priority, and all threads in its way would restart, but it'd complete
in reasonable time without affecting the game too much.

IMO, the events roughly fall into two categories: database recompilation
(change_parents is one such), and game events. I think that game events
roughy take equal ammount of time, and giving priority to the shorter
event is actually big potential problem: If events take 20 and 200 ms
respectively, total processing, when 30 ms even is rescheduled, is
20+2*200 = 420 ms, instead of 220ms. On the other hand, database
recompilations are so unpredictable that they're best run with high
priority or atomically, simply because that way the probability that
something will break is the lowest.

> Another potential problem is that your locking model opens itself to
> infinite loops about the Dining Philosophers problem.  Without some
> very interesting deadlock detection it is quite possible to have (say)
> a minimal set of 3 events each of which attempts to lock various
> objects and all of which get killed/rolled back because one of the
> other two events already has the requested resource.  Consider:
> 
>   Objects A, B and C,
> 
>   Events X, Y and Z.
> 
>   X locks A.
> 
>   Y locks B.
>   
>   Z locks C
> 
>   X attempts to lock B and dies.
> 
>   Y attempts to lock C and dies.
>   
>   X restarts and locks A.
> 
>   Z attempts to lock A and dies.
> 
>   Y restarts and locks B.
> 
>   X attempts to lock B and dies.
> 
>   Z restarts and locks C.
> 
>   Y attempts to lock C and dies.
> 
>   ...etc.
> 
> This is a very simplistic case.  Add many more events, and have each
> event attempt to access more than one object (5?  10?) and it becomes
> almost probable.

Actually I intend to reschedule the tasks with random delays, precisely
because of this (I believe this solution is standard). Deadlock is
improbable, then, because sooner or later things will schedule in the
right order, with proper pause between each.

> Aside: Its not distributable, well, not in any sane manner, whereas
> the lockless model rolls out there pretty easily, especially if you
> allow objects to migrate between DB servers.

My method would support RPC model of distribution, with remote proxies
(Cold already supports proxy objects, but they are not all that useful
without proper thread safety). For that matter, I'm still split between
RPC and replication in choosing the right model for distribution.
Replication bombs the bandwidth with flinging the objects around,
while RPC needs to interchange several remote-call packets to do anything.
On the other hand, replication would fare REALLY badly if two servers
do intense manipulation of the same object for any length of time.

	Miro