[DGD] Code libraries for DGD and Hydra

Felix A. Croes felix at dworkin.nl
Wed Sep 19 06:13:15 CEST 2018


When writing code for Hydra, one has to keep one thing in mind: the
current task may fail to commit, and have to be rescheduled, if one
or more of the objects accessed by the current task were modified by
another task, running concurrently.

The risk can be reduced by keeping the number of objects accessed by
the current task low.  Tasks that access many objects can be broken
up into smaller subtasks, chained together by callouts.
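
A minimal sketch of such a chain, using the standard call_out() kfun
within a single object.  The names BATCH, start(), step() and
update() are made up for illustration, and the batch size is
arbitrary:

    # define BATCH 10        /* hypothetical subtask size */

    private object *todo;   /* objects still to be visited */
    private int pos;        /* index of the next object to visit */

    void start(object *objs)
    {
        todo = objs;
        pos = 0;
        call_out("step", 0);
    }

    /* one subtask: accesses at most BATCH objects */
    static void step()
    {
        int i, end;

        end = pos + BATCH;
        if (end > sizeof(todo)) {
            end = sizeof(todo);
        }
        for (i = pos; i < end; i++) {
            if (todo[i]) {          /* skip destructed objects */
                todo[i]->update();  /* hypothetical per-object work */
            }
        }
        pos = end;
        if (pos < sizeof(todo)) {
            call_out("step", 0);    /* chain the next subtask */
        }
    }

Each step() is started by its own callout, so a failure to commit
should only affect the batch handled by that subtask.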

In Hydra, a callout added to an object does not count as a
modification of that object, as long as the state of the object (the
variables and other callouts) is not read or changed from the same
task.  The cloud library simplifies this with the call_out_other
function:

    call_out_other(obj, func, args...)

Any externally callable function in an object can be called in this
way.  The callout is always started with delay 0.
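
As an illustration, compare a direct call with the indirect one.
This sketch assumes that func is given as a string holding the
function name, as with call_out(); the object door and its function
set_locked() are hypothetical:

    /* direct call: the current task accesses the state of door */
    door->set_locked(1);

    /* indirect call: set_locked(1) runs later in door, in a task
       of its own, started by a callout with delay 0 */
    call_out_other(door, "set_locked", 1);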

Multiple callouts can be added by a task, even to the same object.
For instance, a message could be passed to all users in a game with
call_out_other().  Done with direct calls, this would affect a very
large number of objects and would be very likely to fail to commit;
done with callouts, it does not increase the likelihood that the
task fails.
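
Such a broadcast might look roughly as follows.  This is only a
sketch: broadcast(), receive_message() and the listeners array are
assumed names, and func is assumed to be passed as a string:

    /* in the sending object */
    void broadcast(object *listeners, string text)
    {
        int i, n;

        n = sizeof(listeners);
        for (i = 0; i < n; i++) {
            /* only adds a callout; the state of listeners[i]
               is neither read nor changed by this task */
            call_out_other(listeners[i], "receive_message", text);
        }
    }

    /* in each listener; must be externally callable */
    private string last_message;

    void receive_message(string text)
    {
        last_message = text;    /* runs in a callout-started task */
    }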

As a special case, consider an object which is central in some way,
and which has to be updated often by a variety of tasks.  Normally,
such updates would run a high risk of causing the task to fail to
commit.  But when call_out_other() is used, the calls are indirect,
and the central object can process the update from the
callout-started task.  Hydra can efficiently manage a large number of
pending callouts per object.
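
A sketch of such a central object, here a hypothetical scoreboard
with a function add_score().  All names are invented for the
example:

    /* in the central object */
    private mapping scores;     /* player name -> points */

    void add_score(string name, int points)
    {
        if (!scores) {
            scores = ([ ]);
        }
        if (scores[name] == nil) {
            scores[name] = 0;
        }
        scores[name] += points;
    }

    /* in any task that wants to update it; scoreboard is a
       reference to the central object */
    call_out_other(scoreboard, "add_score", "alice", 10);

The updating task never accesses the state of the scoreboard
itself; the change is made later, from the callout-started task.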

call_out_other() can also be used to implement the map/reduce
coding pattern.  One object chops up a large dataset and adds
callouts with chunks of the data to a large number of other objects.
Those objects each process their chunk within the callout-started
task, and then report back to the originating object using another
callout.  The originating object aggregates the results step by
step, until all the results are in.
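
The pattern could be sketched like this, summing an array of
integers.  The function names start(), map_chunk() and reduce(), the
pending counter and the way chunks are spread over the workers are
all assumptions made for the example, not part of the cloud library:

    /* in the originating object */
    private int pending;    /* chunks not yet reported back */
    private int total;      /* aggregated result so far */

    /* workers must be non-empty, size must be at least 1 */
    void start(object *workers, int *data, int size)
    {
        int i, n, from, to;

        n = sizeof(data);
        total = 0;
        pending = 0;
        i = 0;
        from = 0;
        while (from < n) {
            to = from + size - 1;
            if (to >= n) {
                to = n - 1;
            }
            call_out_other(workers[i % sizeof(workers)],
                           "map_chunk", this_object(),
                           data[from .. to]);
            pending++;
            i++;
            from = to + 1;
        }
    }

    void reduce(int partial)
    {
        total += partial;
        if (--pending == 0) {
            /* all results are in; total is final */
        }
    }

    /* in each worker object */
    void map_chunk(object origin, int *chunk)
    {
        int i, sum;

        sum = 0;
        for (i = 0; i < sizeof(chunk); i++) {
            sum += chunk[i];
        }
        call_out_other(origin, "reduce", sum);
    }

Note that the originating object tracks outstanding chunks in a
variable of its own rather than inspecting its callouts.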

There is one caveat.  Naturally, using status(obj) counts as
accessing the object's state, which is precisely what
call_out_other() avoids.  But status(obj) also includes information
about callouts, and querying it will cause a collision with any
changes to those callouts made by other tasks.  Thus, status(obj)
and status(obj)[O_CALLOUTS] become very expensive operations.

Other than that, call_out_other() makes tasks composable on Hydra,
in a manner that prevents collisions between tasks.

Regards,
Felix Croes


