[MUD-Dev] Distributed State Systems

Alex Arnon alex.arnon at gmail.com
Wed Sep 1 13:38:40 CEST 2004


On Mon, 30 Aug 2004 16:37:31 -0400, Michael Tindal
<mtindal at paradoxpoint.com> wrote:

> Currently what I'm looking at is a central data repository with a
> way for nodes to store/retrieve data, and some form of a data
> cache for each node to minimize round trips.  As the connections
> scale and more nodes are querying for data and storing data in the
> repository the time needed for the data repository to keep up with
> each node increases, thereby increasing latency.

> Are there any ways to ensure that the world stays as consistent as
> possible despite the number of nodes and/or connections? (The idea
> is to keep the system indefinitely scalable, one of my major goals
> in its design).  I'm not sure how the central data repository
> would scale, since every data I/O request is an inter-thread (or
> inter-process) call, the repository has to queue requests until it
> is capable of fulfilling them.  Any suggestions on how to avoid
> this while still maintaining consistency?

Are you using a shared-everything, load-balancing server pool, or a
cluster where each group of N nodes manages a zone (in load-balanced
or failover fashion)?

In the case of shared-everything, it seems to me that DB access and
cache coherence will be your biggest bottlenecks (and they might be
_quite_ big). In that case, you can try to use a COMA-like
architecture (Cache-Only Memory). The principles (a rough sketch
follows the list):

  - Data access is always via a local cache. The backing store for
  the cache is a DB, of course.

  - When retrieving data from the DB (the "cache line"), the DB
  entry is tagged with the ID of the node which caches it.

  - Nodes accessing data which is cached in another node must go to
  that node and retrieve the data from it. You must implement locks
  on the cached data for performing transactions.
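
Roughly, as a hypothetical Python sketch - the class and method
names, the DB interface and the node-to-node transport are all
invented here, and locking is simplified to per-line locks on the
owning node:

  import threading

  class ComaCache:
      """Node-local cache; each DB row records which node holds it."""

      def __init__(self, node_id, db, remote):
          self.node_id = node_id
          self.db = db          # backing store: get(key) -> (owner, value),
                                # set_owner(key, node_id)
          self.remote = remote  # transport: fetch(owner_node, key) -> value
          self.lines = {}       # "cache lines" held by this node
          self.locks = {}       # per-line locks for transactions

      def read(self, key):
          if key in self.lines:              # local cache hit
              return self.lines[key]
          owner, value = self.db.get(key)
          if owner not in (None, self.node_id):
              # The line is cached on another node: fetch the current
              # copy from it rather than trusting the DB's stale value.
              value = self.remote.fetch(owner, key)
          self.db.set_owner(key, self.node_id)   # tag the entry as ours
          self.lines[key] = value
          self.locks.setdefault(key, threading.Lock())
          return value

      def transact(self, key, fn):
          # Mutations happen on the owning node, under the line's lock.
          value = self.read(key)
          with self.locks[key]:
              self.lines[key] = fn(value)
              return self.lines[key]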

Anyway, shared-everything seems like a serious bother, and limits
scalability (unless data is practically always "unshared"). You
either need to go to the DB (or to another node) for data, or cache
aggressively and hope each data set is accessed almost exclusively
by a single node.

For the other option (failover etc.), your post seems to imply you'd
have that solved rather easily. One method that we used at one of my
workplaces was to keep a "standby" server for each live one (each
live server handling all connections for a specific subset of N
subscribers); all operations were done against a local, simple DB
(similar to DBM: simple file-based tables/ISAM). Every DB
modification/transaction was also sent over the net to the standby
server. When the live server went down, the standby was sufficiently
up to date to take over, since it would boot with a DB that was
current to within the last X milliseconds. This scheme cost up to
15% in CPU time (completely unoptimized, mind you), but failover was
practically instantaneous and the implementation was very simple
(gateway + config files + raw binary net protocol + primitive DB =
profit).
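
A stripped-down sketch of that write-shipping in Python (the
host/port plumbing, the JSON-lines wire format and all the names
here are my own invention - the real thing used a raw binary
protocol on top of a DBM-style store):

  import dbm, json, socket

  class ReplicatedStore:
      """Live side: apply each write locally, then ship it to the standby."""

      def __init__(self, path, standby_addr):
          self.db = dbm.open(path, "c")     # simple file-based table
          self.standby = socket.create_connection(standby_addr)

      def put(self, key, value):
          self.db[key] = value
          record = json.dumps({"k": key, "v": value}).encode() + b"\n"
          self.standby.sendall(record)      # one record per modification

  def run_standby(path, listen_addr):
      """Standby side: replay the stream into a local DB so that, on
      failover, it boots with data current to the last few milliseconds."""
      db = dbm.open(path, "c")
      srv = socket.socket()
      srv.bind(listen_addr)
      srv.listen(1)
      conn, _ = srv.accept()
      buf = b""
      while chunk := conn.recv(4096):
          buf += chunk
          while b"\n" in buf:
              line, buf = buf.split(b"\n", 1)
              rec = json.loads(line)
              db[rec["k"]] = rec["v"]       # keys/values stored as str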

There are other schemes that can be implemented, but in general it
seems to me that partitioning the world into zones and moving
objects between them is the most productive method. Some sort of
interesting handover mechanism would have to be implemented if the
world is supposed to be contiguous, and you might need to constrain
the number of live objects (mobs + players) in a zone, but these are
solvable problems (and in the case of shared-everything your servers
will get bogged down with locking anyway when too many players crowd
into the same area - and in other cases too).
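
For the handover, a minimal sketch along the same lines (the zone
table, the "frozen" flag and the one-byte ACK are assumptions of
mine, not a description of any real system) - the point is only that
an object should never be live in two zones at once, so the sending
zone freezes it, ships it, and only forgets it once the target
acknowledges:

  import json, socket

  class ZoneServer:
      def __init__(self, zone_id, neighbours):
          self.zone_id = zone_id
          self.neighbours = neighbours   # zone_id -> (host, port), from config
          self.objects = {}              # obj_id -> state dict (mobs, players)

      def hand_over(self, obj_id, target_zone):
          state = self.objects[obj_id]
          state["frozen"] = True         # stop simulating it while in transit
          payload = json.dumps({"id": obj_id, "state": state}).encode() + b"\n"
          with socket.create_connection(self.neighbours[target_zone]) as conn:
              conn.sendall(payload)
              if conn.recv(3) == b"ACK":
                  del self.objects[obj_id]   # target confirmed; drop our copy
              else:
                  state["frozen"] = False    # handover failed; resume locally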

Are any of you Industry Types willing to share your experience in
implementing MMO server farms? What is the MMO builder's zeitgeist?
:)
_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev


