[MUD-Dev] UDP Revisited
Bruce Mitchener
bruce at puremagic.com
Thu Oct 25 00:05:03 CEST 2001
Bobby,
Thanks for the response. :)
Bobby Martin wrote:
>> From: Bruce Mitchener <bruce at puremagic.com>
>> I was curious as to what sorts of advanced features you had in
>> ARMI and the documentation didn't really go into much detail
>> beyond some usage notes.
>> Do you have any support for pipelining of requests?
> No, although TCP (which I currently use) buffers packets for up to
> 200 ms before it sends the buffer.
Mmmm, this isn't quite what I was thinking about when I wrote that.
If I fire off 3 message sends in a row, and TCP is the transport in
use, what happens? Do all 3 get sent to the remote end right after
each other and then responses are returned, tagged with some request
identifier as they happen? Is there an ordering imposed on the
return of requests? (Say that you send off messages 1 and 3 to
object A, and message 2 to object B. Object B takes a long time in
responding. Does that delay the return of the result for #3?)
Or, similar to that, what happens if one method call returns a very
large bunch of data. Will that hold up following method sends, or
do you break data into chunks similar to what BEEP and HTTP/1.1 can
do to multiplex a single channel?
Finally (for now :)), do you share the transport connection, or do
you establish new ones frequently (like per-object, per-method send,
etc)?
But, on the point that you brought up, do you have control over that
200 ms buffering from Java? (To turn off Nagle's algorithm or
whatever you might want to change.) Is the 200 ms delay proving
troublesome at all? That seems like a fairly high penalty to pay in
terms of latency for any distributed communication.
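For what it's worth, Java does expose per-socket control over Nagle's algorithm via `setTcpNoDelay`. A small sketch (the `connectNoDelay` helper is my own illustration, not anything from ARMI):

```java
import java.io.IOException;
import java.net.Socket;

public class NoDelayExample {
    // Returns a connected socket with Nagle's algorithm disabled, so
    // small writes go out immediately instead of being coalesced.
    public static Socket connectNoDelay(String host, int port) throws IOException {
        Socket s = new Socket(host, port);
        s.setTcpNoDelay(true); // sets TCP_NODELAY on the underlying socket
        return s;
    }
}
```

Note that this only turns off Nagle coalescing; any remaining delay from the peer's delayed-ACK behavior is up to the TCP stack, not Java.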
>> How about for monitoring message traffic or requests for
>> debugging, checking load balancing, etc?
> Currently in the works, but not there right now.
What sort of stuff is in the works? :)
>> What sorts of security or authentication do you support?
> Currently, none :( It is fairly trivial to add an encrypting
> message filter, though. You can filter messages that you send any
> way you like.
What sort of model are you thinking about though? Is this mainly for
communication between trusted peers? Or might it be used in a more
hostile environment with foreign (or user-written) code involved?
>> Can you compare your packing overhead to that of RMI or other
>> systems? Do you compress your datastreams?
> We don't currently formally compress data streams. I have done
some sampling of RMI throughput versus ARMI throughput; ARMI was
> about 10 times smaller for the method calls I tested. This
> definitely needs more formal testing.
and later in your post, but moved here for ease in replying ...
> I have implemented production compression systems before, and I
> can tell you that you will almost certainly need tailor made
> compression (i.e. just write clever serialization code) for any
> significant gains. Lots of small information packets are hard to
> compress, unless you take large production runs and analyze them,
> then compress assuming the usage patterns don't change too much.
> Most compression mechanisms require that every block of data you
> compress has a table describing how to decompress it, and for
> small blocks the table takes up more space than the amount you
> gain by compressing. If you analyze large production runs of
> data, the tables can be static and just sit on either end of the
> connection.
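The static-table idea you describe maps fairly directly onto zlib's preset dictionaries, which Java exposes through `Deflater.setDictionary`: a dictionary built offline from analyzed production traffic sits on both ends, so small packets compress without shipping a table. A sketch (the helper and sample strings are mine):

```java
import java.util.zip.Deflater;

public class PresetDictExample {
    // Compress a small message, optionally priming the compressor with
    // a shared dictionary derived from analyzed production runs.
    public static int compressedSize(byte[] dict, byte[] msg) {
        Deflater d = new Deflater(Deflater.BEST_COMPRESSION);
        if (dict != null) {
            d.setDictionary(dict); // the "static table", known to both ends
        }
        d.setInput(msg);
        d.finish();
        byte[] out = new byte[msg.length + 64];
        int n = d.deflate(out);    // bytes of compressed output
        d.end();
        return n;
    }
}
```

For a short message whose vocabulary mostly appears in the dictionary, the dictionary-primed size comes out well under the plain-deflate size, which tends to *expand* tiny inputs.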
With respect to serialization strategy being more important than a
traditional compression algorithm, I definitely agree. Brad Roberts
and I made some changes to Cold about 2-3 months ago that changed
how we serialized some bits of data and objects for storage into the
DB. Those changes managed to take a DB that was roughly 2G and drop
it to about 1.6G in size, which was quite nice. (It also had the
interesting side effect of reducing our fragmentation of free space
on disk, as more objects were now small enough to fit in the minimum
block size, consuming bits of free space floating about that
previously just contributed to wasted space. Those gains aren't
accounted for in the numbers I gave above.)
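As a toy illustration of why hand-written serialization beats a generic mechanism on size (this has nothing to do with Cold's actual format; the helper names are mine): Java's default serialization writes class descriptors alongside the data, while packing the fields by hand writes only the bytes you actually need.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSizeExample {
    // Generic mechanism: stream headers and class descriptors included.
    public static int javaSerializedSize(Serializable obj) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
        }
        return buf.size();
    }

    // Hand-packed: just the two fields, 8 bytes total.
    public static int handPackedSize(int x, int y) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeInt(x);
        out.writeInt(y);
        return buf.size();
    }
}
```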
Regarding throughput, you say that messages were smaller. Were the
round-trip times also less? Or roughly similar?
>> Any more information to explain in more detail how/why ARMI is
>> great is welcome. :)
> Offhand I don't know what I can tell you about why ARMI is great
> that isn't listed on the web site. Main attributes are:
> 1) Your method calls can be asynchronous (your app doesn't stop
> and wait for the return value from a method call until you need
> the return value.)
Will you provide some higher level support to be used by a caller of
an asynchronous method to help deal with the failure cases that
might result, or any sort of syntactic sugar/glue to help deal with
the complexities that arise from a more asynchronous model?
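One common shape for such support (a sketch of the general pattern, not ARMI's actual API) is to have the asynchronous send return a future immediately and let the caller hang success and failure handlers off it, in modern Java terms via `CompletableFuture`:

```java
import java.util.concurrent.CompletableFuture;

public class AsyncCallExample {
    // Stand-in for a remote method send: returns at once with a future;
    // the result (or failure) arrives later on another thread.
    public static CompletableFuture<Integer> remoteAdd(int a, int b) {
        return CompletableFuture.supplyAsync(() -> a + b);
    }

    public static void main(String[] args) {
        CompletableFuture<Integer> f = remoteAdd(2, 3);
        // The caller keeps running; failure handling is attached here
        // instead of being scattered through blocking call sites.
        f.whenComplete((value, err) -> {
            if (err != null) System.out.println("call failed: " + err);
            else System.out.println("result: " + value);
        });
        f.join(); // demo only; a real caller wouldn't block here
    }
}
```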
Some cool work in this that has been done is in the E programming
language. (Disclaimer: that work probably is from predecessors to E,
but that was my first exposure to this and I can't credibly cite
which predecessor it might have been.)
Taking from the E in a Walnut draft book, and the chapter on
Distributed Computing, located at
http://www.skyhunter.com/marcs/ewalnut.html#SEC18:
-- begin quote --
All distributed computing in E starts with the eventually
operator, "<-":
# E syntax
car <- moveTo(2,3)
println("car will eventually move to 2,3. But not yet.")
The first statement can be read as, "car, eventually,
please moveTo(2,3)". As soon as this eventual send has
been made to the car, the program immediately moves on
to the next sequential statement. The program does not
wait for the car to move. Indeed, as discussed further
under Game Turns below, it is guaranteed that the
following statement (println in this case) will execute
before the car moves, even if the car is in fact running
on the same computer as this code.
--- end quote ---
and ....
-- begin quote --
When you make an eventual send to an object (referred to
hereafter simply as a send, which contrasts with a call
to a local object that waits for the action to complete),
even though the action may not occur for a long time, you
immediately get back a promise for the result of the action:
# E syntax
def carVow := carMaker <- new("Mercedes")
carVow <- moveTo(2,3)
In this example, we have sent the carMaker object the
"new(name)" message. Eventually the carMaker will create
the new car on the same computer where the carMaker
resides; in the meantime, we get back a promise for
the car. We can make eventual sends to the promise
just as if it were indeed the car, but we cannot make
immediate calls to the promise even if the carMaker
(and therefore the car it creates) actually live on
the same computer with the program. To make immediate
calls on the promised object, you must set up an action
to occur when the promise resolves, using a when-catch
construct:
# E syntax
def temperatureVow := carVow <- getEngineTemperature()
when (temperatureVow) -> done(temperature) {
    println("The temperature of the car engine is: " + temperature)
} catch prob {
    println("Could not get engine temperature, error:" + prob)
}
println("execution of the when-catch waits for resolution of the promise,")
println("but the program moves on immediately to these printlns")
We can read the when-catch statement as, "when the promise
for a temperature becomes done, and therefore the temperature
is locally available, perform the main action block... but
if something goes wrong, catch the problem in variable
prob and perform the problem block". In this example,
we have requested the engine temperature from the
carPromise. Eventually the carPromise resolves to a
car (possibly remote) and receives the request for
engine temperature; then eventually the temperaturePromise
resolves into an actual temperature. The when-catch
construct waits for the temperature to resolve into
an actual (integer) temperature, but only the when-catch
construct waits (i.e., the when catch will execute
later, out of order, when the promise resolves). The
program itself does not wait: rather, it proceeds on,
with methodical determination, to the next statement
following the when-catch.
Inside the when-catch statement, we say that the promise
has been resolved. A resolved promise is either fulfilled
or broken. If the promise is fulfilled, the main body of
the when-catch is activated. If the promise was broken,
the catch clause is activated.
Notice that after temperatureVow resolves to temperature
the when clause treats temperature as a local object. This
is always safe to do when using vows. A vow is a reference
that always resolves to a local object. In the example,
variable names suffixed with "vow" remind us and warn
others that this reference is a vow. We can reliably
make eventual sends on the vow and immediate calls
on what the vow resolves to, but we cannot reliably
make immediate calls to the vow itself. A more detailed
discussion of these types of warnings can be found in
the Naming Conventions section later.
--- end quote ---
E takes a very thought out and considered approach to distributed
computing (especially of the variety where untrusted code may be
involved). But, were you looking at providing any similar sort of
mechanism for simplifying working with asynchronous method sends?
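For comparison, E's eventual sends to an unresolved promise correspond roughly to chaining work onto a future before it completes; a loose Java analogue of the car example above (assuming nothing about ARMI, and using `CompletableFuture` purely as a stand-in for E's promises):

```java
import java.util.concurrent.CompletableFuture;

public class PromisePipelineExample {
    static class Car {
        int getEngineTemperature() { return 90; }
    }

    // Analogue of carMaker <- new("Mercedes"): a promise for a car.
    static CompletableFuture<Car> makeCar(String name) {
        return CompletableFuture.supplyAsync(Car::new);
    }

    public static void main(String[] args) {
        CompletableFuture<Car> carVow = makeCar("Mercedes");
        // Send to the promise before it resolves, like
        // carVow <- getEngineTemperature():
        CompletableFuture<Integer> tempVow =
                carVow.thenApply(Car::getEngineTemperature);
        // when-catch: the body runs when the promise resolves,
        // the catch arm runs if it breaks.
        tempVow.whenComplete((t, prob) -> {
            if (prob != null) System.out.println("Could not get engine temperature: " + prob);
            else System.out.println("The temperature of the car engine is: " + t);
        });
        System.out.println("but the program moves on immediately");
        tempVow.join(); // demo only
    }
}
```

The analogy is imperfect (E's sends also pipeline across the network, and its event-loop turns give stronger ordering guarantees than a thread pool), but the shape of the caller's code is similar.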
I've found that the addition of syntactic sugar for handling
asynchronous operations can even be useful when distributed
computations aren't being used. In TEC, we've got a scripting
language that GMs use for some things that compiles to ColdC. I've
taken advantage of that to provide some fairly high level approaches
to solving problems that then compile into the lower-level, messy
code. (And we lack a lot of the complexity of the environment that
E is designed for, but some of the same principles are paying off
well for us.)
> 2) I use ids for the class and method instead of the huge
> descriptors that RMI uses.
Does that change the security profile of your protocol at all? (Is
there a good reason that RMI uses large descriptors?)
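To make the size difference concrete (illustrative numbers only, not ARMI's actual wire format): where RMI ships string and UID descriptors per call, an id-based header can be a handful of bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class CallHeaderExample {
    // Encode a call header as (classId, methodId, argCount):
    // 2 + 2 + 1 = 5 bytes, versus tens or hundreds of bytes for
    // RMI-style serialized descriptors.
    public static byte[] encodeHeader(short classId, short methodId, byte argCount)
            throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeShort(classId);
        out.writeShort(methodId);
        out.writeByte(argCount);
        return buf.toByteArray();
    }
}
```

The flip side is that both ends must agree on the id tables, which is where the versioning/security questions come in.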
> 4) <not implemented yet> You can choose whether a method call is
> guaranteed or not. Currently you either use UDP for transport
> and all method calls are non-guaranteed but fast, or you use TCP
for transport and all method calls are guaranteed but slower (but
> still much faster than RMI).
How will you do this? Will it be something that is decided at a
method call site? Or will it be per-method (and globally decided)?
If the latter, will you have some sort of IDL that provides that
info, a registration process, or something else?
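One way the call-site option could look (entirely hypothetical; the names and the idea of returning the chosen transport are mine, for illustration):

```java
public class DeliveryExample {
    enum Delivery { GUARANTEED, UNRELIABLE }

    // Hypothetical per-call-site API: the caller decides, call by call,
    // whether the send needs guaranteed (TCP-style) delivery or can
    // ride an unreliable (UDP-style) datagram.
    static String send(String method, Delivery mode, Object... args) {
        // A real implementation would enqueue on the right transport;
        // here we just report which one would carry the call.
        return (mode == Delivery.GUARANTEED ? "tcp:" : "udp:") + method;
    }

    public static void main(String[] args) {
        send("moveTo", Delivery.UNRELIABLE, 2, 3);      // lossy position update is fine
        send("transferGold", Delivery.GUARANTEED, 100); // must not be dropped
    }
}
```

The per-method, globally decided alternative would push the same choice into an IDL or a registration step instead of the call site.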
That sort of thing is where I think that a language with
extensible parsing/keywords is really useful. Taking it back to
persistence for a moment, you might invent your own keywords to
specify persistence behaviors. Or for distributed computing, you
might just tag your source with directives on how that variable
should be updated over the network.
I'm not a big fan of requiring a whole new IDL or something for
stuff like that, or a specialized program like OpenSkies uses for
their object definitions. But, then, I guess not many mainstream
programming languages have an extensible grammar.
- Bruce
_______________________________________________
MUD-Dev mailing list
MUD-Dev at kanga.nu
https://www.kanga.nu/lists/listinfo/mud-dev