C&C and Event Rescheduling

Wed Jul 30 17:28:21 CEST 1997

> In <199707261742.TAA00453 at regoc.srce.hr>, on 07/26/97 
>    at 10:53 AM, silovic at srce.hr (Miroslav Silovic) said:

> commit as I do.  This could allow a failed event to die and reschedule
> much sooner than if it had to wait for the other contender to commit

This was my intention.

> Hurm.
> 
> It also changes the character fo the DB accesses.  In the lockless
> model events compete to C&C successfully.  Events with smaller
> workings have an advantage at C&C time, as do events with shorter run
> times.  Its a question of who is in and out first.  Neither of those
> points are true for your model.  For you, more or less, the first to
> reach the base has it, holds it, and is guaranteed to commit
> successfully (ignoring questions of needing to access an object which
> another already has locked, which is not a safe assumption as below). 

I consider this a good thing. Remember, you have change_parents
function in Cold. When you call it, it guarantees that the database
will be consistant after termination. The problem is that it means
that constructor/destructor pairs get called on /all/ descendants of
the affected object. Adding a parent to the user object, in other
words, will touch all the users. This means that such task has to run
with VERY high priority (and in my scheme, I intend to knock the locks
off the lower-priority tasks, precisely because of this). Now, this
takes about 20-30 seconds on a LARGE db (say, 5,000 users), if the
parent change is significant. This means that if lower-priority tasks
could kill this at will, it'd take 5-10 minutes for it to grow high
enough in priority to actually complete, according to your scheme. In
my scheme, it'd get max priority, and all threads in its way would
restart, but it'd complete in reasonable time without affecting the
game too much.

IMO, the events roughly fall into two categories: database
recompilation (change_parents is one such), and game events. I think
that game events roughy take equal ammount of time, and giving
priority to the shorter event is actually big potential problem: If
events take 20 and 200 ms respectively, total processing, when 30 ms
even is rescheduled, is 20+2*200 = 420 ms, instead of 220ms. On the
other hand, database recompilations are so unpredictable that they're
best run with high priority or atomically, simply because that way the
probability that something will break is the lowest.

> Another potential problem is that your locking model opens itself to
> infinite loops about the Dining Philosophers problem.  Without some
> very interesting deadlock detection it is quite possible to have (say)
> a minimal set of 3 events each of which attempts to lock various
> objects and all of which get killed/rolled back because one of the
> other two events already has the requested resource.  Consider:
> 
>   Objects A, B and C,
> 
>   Events X, Y and Z.
> 
>   X locks A.
> 
>   Y locks B.
>   
>   Z locks C
> 
>   X attempts to lock B and dies.
> 
>   Y attempts to lock C and dies.
>   
>   X restarts and locks A.
> 
>   Z attempts to lock A and dies.
> 
>   Y restarts and locks B.
> 
>   X attempts to lock B and dies.
> 
>   Z restarts and locks C.
> 
>   Y attempts to lock C and dies.
> 
>   ...etc.
> 
> This is a very simplistic case.  Add many more events, and have each
> event attempt to access more than one object (5?  10?) and it becomes
> almost probable.

Actually I intend to reschedule the tasks with random delays,
precisely because of this (I believe this solution is standard).
Deadlock is improbable, then, because sooner or later things will
schedule in the right order, with proper pause between each.

> Aside: Its not distributable, well, not in any sane manner, whereas
> the lockless model rolls out there pretty easily, especially if you
> allow objects to migrate between DB servers.

My method would support RPC model of distribution, with remote proxies
(Cold already supports proxy objects, but they are not all that useful
without proper thread safety). For that matter, I'm still split
between RPC and replication in choosing the right model for
distribution. Replication bombs the bandwidth with flinging the
objects around, while RPC needs to interchange several remote-call
packets to do anything. On the other hand, replication would fare
REALLY badly if two servers do intense manipulation of the same object
for any length of time.

	Miro