[MUD-Dev] byte-code anyone?

Chris Gray cg at ami-cg.GraySage.Edmonton.AB.CA
Mon Feb 16 09:47:20 CET 1998


[Jon Leonard:]

:I've been implementing something similar in the MUD I'm writing, but I
:realized fairly early on that I could get even better speedups by compiling
:all the way to machine code instead of just to bytecode.  (What JIT runtimes
:do for Java)

I thought about that. It's still a possibility, but I probably won't need
it. I would do direct code generation within my system, rather than go via
an external compiler. I would have to *really* want it, however, since doing
it that way is inherently non-portable. Since most of the things that a
MUD scenario does (string operations, DB access, NL parsing, dynamic
calls, etc.) are fairly expensive, most MUD code would see little benefit
from going to native code. The particular code I'm interested in would,
but it's hard to justify the effort and non-portability just for that.

:My recommendations would be more complete if I were done, but here's the
:advice I'm giving myself:
:
:1) Test the virtual machine carefully starting early on, so fewer bugs need
:  to be laboriously traced by single-stepping an interpreter.

Hmm. I'd planned on testing by fire. My byte-code machine is going to be
quite low level, so visual inspection is feasible. Also, it would be extra
work to create an 'assembler' for it.

:2) Design so that no matter how perverse the byte codes you get fed are,
:  you don't crash or violate security assumptions.  Going from an interpreted
:  language to byte codes or machine code makes demonstrating correctness
:  much harder.

That would slow the system down too much. I've found, for example, that
even having a 'default' in the byte-code case statement adds noticeable
overhead. I have made a simple disassembler that lets me see the byte-
code for functions. Since the only source of byte-code will be the simple
compiler, my current view is that checking the byte-code is not needed.
There will be some processing required to set it up for execution (linking
to other byte-code), but I think that's about it.
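
For illustration, a disassembler of that kind can be very small. The sketch
below is in Python for brevity and is not the actual AmigaMUD code: the
opcode numbers, mnemonics and operand sizes in OPCODES are all hypothetical.

```python
# Hypothetical opcode table: opcode byte -> (mnemonic, operand byte count).
OPCODES = {
    0x00: ("RETURN", 0),
    0x01: ("PUSH_CONST", 1),
    0x02: ("ADD", 0),
    0x03: ("LOAD_LOCAL", 1),
    0x04: ("CALL", 2),
}


def disassemble(code):
    """Return one text line per instruction found in the bytes `code`."""
    lines = []
    pc = 0
    while pc < len(code):
        # An unknown opcode raises KeyError; this tool is for inspection,
        # not for validating untrusted byte-code.
        mnemonic, nargs = OPCODES[code[pc]]
        text = f"{pc:04d}  {mnemonic}"
        if nargs:
            operand = int.from_bytes(code[pc + 1 : pc + 1 + nargs], "big")
            text += f" {operand}"
        lines.append(text)
        pc += 1 + nargs
    return lines
```

Calling disassemble(bytes([0x01, 5, 0x01, 7, 0x02, 0x00])) lists four
instructions, each prefixed with its byte offset.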

You have reminded me that I forgot about my security model in this,
however. I might just punt by deciding that any function that changes
the security settings from those of its caller will not be compiled. I
have the luxury of falling back on the full parse tree interpreter.

Why does going from an interpreted language make things harder? In my case,
it makes things a lot easier. I already have correct parse trees for the
code, and can just traverse them to emit the code. My language is strongly
typed, so I don't have any of those nasty issues to worry about. The one
non-strongly typed construct (calling an action retrieved as a value,
rather than by name) can be handled, or I might just disallow compilation
of routines that do that. I can do that, since I still have the original
parse tree interpreter.
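
The idea of walking an existing parse tree to emit code, and declining to
compile what is awkward, might look roughly like this. It is a sketch in
Python under stated assumptions: the tuple-based tree shape, the node names
("const", "local", "add", "call_value") and the opcode numbers are invented
for the example and are not the real system's.

```python
# Hypothetical opcode numbers for a small stack machine.
RETURN, PUSH_CONST, ADD, LOAD_LOCAL = 0, 1, 2, 3


class NotCompilable(Exception):
    """Raised for constructs that are left to the parse tree interpreter."""


def emit(node, out):
    """Append byte-code for the expression `node` to the list `out`."""
    kind = node[0]
    if kind == "const":
        out.extend([PUSH_CONST, node[1]])
    elif kind == "local":
        out.extend([LOAD_LOCAL, node[1]])
    elif kind == "add":
        emit(node[1], out)  # left operand first
        emit(node[2], out)  # then right operand
        out.append(ADD)     # operator last (postorder traversal)
    else:
        # e.g. calling an action held in a value: do not compile it
        raise NotCompilable(kind)


def compile_expr(tree):
    """Return byte-code for `tree`, or None to request the fallback."""
    out = []
    try:
        emit(tree, out)
    except NotCompilable:
        return None
    out.append(RETURN)
    return bytes(out)
```

A caller that gets None back simply keeps running that routine through the
tree interpreter, so unsupported constructs cost nothing extra to handle.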

:3) Pick carefully what the simple primitives are.  Missing a vital one
:  can cost lots of performance, but too many bloats your
:  interpreter/compiler/security model.

Agreed. I've been fiddling as I go along. I expect I'll end up with around
100 opcodes (considerably fewer than Java has).

:4) Don't expect to be done soon.  A compiler can be a lot of work.

Ah, but most of it is already done. I expect to be running some simple
byte-code today. I haven't even specified all of the byte-codes yet, but
doing things incrementally lets me cycle the design with little cost. As
long as I have enough for at least function entry and exit, I can run things.
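
To show how little is needed before something runs, here is a sketch of an
interpreter loop with only four opcodes. It is written in Python for
brevity, and the opcode set and calling convention (arguments become
locals, RETURN pops the result) are assumptions made for the example.

```python
# Hypothetical opcode numbers; more can be added one at a time.
RETURN, PUSH_CONST, ADD, LOAD_LOCAL = 0, 1, 2, 3


def run(code, args=()):
    """Execute the bytes `code` and return the function's result."""
    stack = []
    local_vars = list(args)  # function entry: arguments become locals
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH_CONST:
            stack.append(code[pc])
            pc += 1
        elif op == LOAD_LOCAL:
            stack.append(local_vars[code[pc]])
            pc += 1
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == RETURN:
            # function exit: the result, if any, is on top of the stack
            return stack.pop() if stack else None
        else:
            # A checked default like this is what a trusted-compiler
            # design could choose to leave out for speed.
            raise ValueError(f"unknown opcode {op} at offset {pc - 1}")
```

Each new opcode is one more branch in the loop, which is what makes the
incremental approach cheap.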

Oh yes, AmigaOS has the 'load' and 'unload' calls, so the scheme of
compiling via an external compiler could be made to work. The big issue
would be resolution of references to the already-loaded code. You'd have
to do that with a jump vector passed to the loaded code. (I did that on
a game system on CP/M, and I also use it in my current AmigaMUD system.)
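
The jump vector idea can be sketched as follows. This is a Python analogy
only: real loaded code would receive a table of addresses, and the entry
points (emit_text, add_score), slot numbers and function names here are
hypothetical.

```python
def make_jump_vector(log):
    """Host entry points, in a fixed order agreed with the loaded code."""

    def emit_text(text):
        log.append(text)

    def add_score(points):
        log.append(f"score+{points}")

    return [emit_text, add_score]


# Slot numbers the loaded code is built against (hypothetical).
EMIT_TEXT, ADD_SCORE = 0, 1


def loaded_module_init(vector):
    """Stands in for separately compiled code: it sees only the vector."""

    def on_enter_room(name):
        # No symbol resolution is needed: every call into the host
        # goes through a numbered slot in the vector.
        vector[EMIT_TEXT](f"You enter {name}.")
        vector[ADD_SCORE](1)

    return on_enter_room
```

The host builds the vector once and hands it over at load time, for example
handler = loaded_module_init(make_jump_vector(log)). The slot order then
has to stay stable for as long as old loaded code is still in use.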

--
Chris Gray   cg at ami-cg.GraySage.Edmonton.AB.CA


