[MUD-Dev] Mud Languages

Sat Aug 9 04:51:16 CEST 1997

> 
> One of my contributors mentioned how much they liked the 'scoping' of 
> Pascal a week or two ago. Things like strongly typed, weakly typed - what 
> does this actually *mean*? Is it simply that the types of function params 
> are specified in advance? Or am I thinking along the wrong lines? Or is 
> it that there is a bit more to it than I am aware of?

Err, okay. Strong and weak typing are actually wrong terms, in my
opinion. The correct terms are static vs dynamic typing. In static
typing, /all/ the types are known at compile time. This is usually
implemented by requiring that all the variables be declared - one
of the exceptions is ML, which requires only enough declarations that
the rest of the types can be deduced (and the compiler is amazingly
good at deducing things for you). With dynamic typing, types are
known at runtime. Efficient compilers are often able to deduce
a /lot/ about your code, and optimize it by converting to static
typing. On the other hand, the freedom to use dynamic typing is,
IMHO, very important for rapid code design (people who have never
tried it seriously will probably disagree).
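
To make the distinction concrete, here is a minimal sketch in Python,
chosen purely for illustration (it is not a language discussed in this
thread). The function names are hypothetical. Nothing is declared, each
value carries its own type tag, and a type mismatch only shows up when
the offending line actually runs:

```python
def double(x):
    # No declared types: any value that supports `* 2` is accepted.
    return x * 2


def try_double(x):
    # The mismatch is detected at runtime, when the operation executes.
    try:
        return double(x)
    except TypeError:
        return "runtime type error: " + type(x).__name__
```

Here double(21) and double("ab") both work, while try_double(None)
reports the error only once the call is made. A static type checker
would have rejected that last call before the program ran.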

> Are there any specific things that I should be aware of, when designing a 
> language? (I am not interested in implementation issues atm.)

Design is far more important, because if you take your time to look
at the articles on compiler theory on the net, you'll find out
that pretty much /every/ design can be implemented to run within 1/3
of the speed of C (except tcl, but now they're compiling even /that/).

There are many issues. Many people think that design is a tradeoff
between efficiency and ease of use. Actually design is a tradeoff
between compiler complexity and ease of use. :) I'll come to the
issues in a bit.

> > I've used quite a few (as, I suspect, have a few of the other older
> > members of this list), including APL, AlgolW, Algol68, Pascal, Lisp,
> > Basic, Fortran, Forth, several assemblers, and a few languages of my
> > own design. 
> 
> Of these languages, which features do you like? Which don't you like? Why?
> 
> Is it possible to say, in general terms, what makes a bad programming 
> language? And what makes a good one?

At the risk of starting a flamewar, there are very few languages which are bad
in general. I'd say that the /really/ bad feature is failure
to check the types of function arguments (either at compile time or
at runtime). This is my major gripe with FORTRAN and K&R C (thankfully,
they fixed this in ANSI C).
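
As an illustration of checking arguments at runtime, here is a small
Python sketch. The decorator name `checked` and the example function
`damage` are made up for this example, and using annotations as the
source of expected types is an assumption of the sketch, not something
the languages above do:

```python
import functools
import inspect


def checked(func):
    """Verify annotated argument types on every call."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # bind() itself raises TypeError on a wrong number of arguments.
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = sig.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # unannotated parameter: nothing to check
            if not isinstance(value, expected):
                raise TypeError(
                    "%s: argument %r must be %s, got %s"
                    % (func.__name__, name, expected.__name__,
                       type(value).__name__)
                )
        return func(*args, **kwargs)

    return wrapper


@checked
def damage(attacker: str, amount: int):
    return "%s hits for %d" % (attacker, amount)
```

A call such as damage("orc", "3") is rejected at the call boundary with
a message naming the bad argument, instead of failing somewhere deep
inside the function body.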

Another thing that I hate is high sensitivity to programmer error.
C++ is particularly affected by this - just try to play with templates
a bit and watch the code-size explode. Error tolerance is probably the
single most important issue to deal with. This means that errors, when
they occur, should be meaningfully reported. But it also means that
logical errors should be relatively rare.

> I know someone who is looking into dynamic linking of object files into a 
> running executable. Has anyone here done anything like this?

The Cold people are considering doing this. There is another possibility:
There is a reasonably portable code generator, VCODE, that spews machine
code directly into RAM. See http://www.pdos.lcs.mit.edu/~engler/pldi96-abstract.html

> > If the language you create/use is familiar to a lot of your workers, then
> > it is easier to get them to work in it, since there is less learning
> > curve, and there is the "instant gratification" that can be so important
> > in getting people motivated. I essentially did that with my system, but
> > since I knew that I was going to be the only user for a while, I chose
> > to make my language like other languages I had designed, and not like
> > any mainstream languages. My dislike for many mainstream languages, and
> > the simplicity of parsing my own languages influenced this a lot, too.
> 
> What is it that you dislike about mainstream languages? Why?

Okay, here's the main design point: There are MANY issues you have to
think about when designing a MUD language. VERY few mainstream languages
satisfy these in a sensible manner. Here's a quick breakdown:

1) object orientation

MUDs are usually simulations of the kind that lend themselves EXTREMELY
well to object-oriented design. So, the language should have a clean and
well-designed object system. This kills C, FORTRAN, and Pascal at the
start.

2) memory management

This is the single most insidious source of bugs in all the programming
projects I have dealt with. A MUD environment is extremely dynamic in
nature. You /really/ don't want holes in your memory management. You really
don't want to /think/ about memory management, either. I think this
kills C++. Although Nathan wrote a very elaborate refcounting allocator
for his MUD, he is still plagued by the occasional glitch. I think that
a MUD language, to be at all usable, must either be garbage-collected,
or otherwise support only a sufficiently limited set of data structures
so that reference counting can be done automatically and still work
reliably (read: No pointers!).
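
One reason plain reference counting needs restricted data structures is
cycles. The following Python sketch assumes CPython, which combines
reference counting with a separate cycle-detecting collector; the
`Room` class and `demo` function are hypothetical names. With the cycle
collector switched off, two rooms that point at each other are never
freed, because each keeps the other's count above zero:

```python
import gc
import weakref


class Room:
    def __init__(self, name):
        self.name = name
        self.exits = []


def build_cycle():
    hall = Room("hall")
    cellar = Room("cellar")
    hall.exits.append(cellar)   # hall -> cellar
    cellar.exits.append(hall)   # cellar -> hall: a reference cycle
    # A weak reference lets us observe the object without keeping it alive.
    return weakref.ref(hall)


def demo():
    gc.disable()                # leave only reference counting active
    try:
        probe = build_cycle()
        alive_with_refcounts_only = probe() is not None
        gc.collect()            # run the tracing cycle collector by hand
        alive_after_collection = probe() is not None
    finally:
        gc.enable()
    return alive_with_refcounts_only, alive_after_collection
```

Under these assumptions demo() reports that the rooms survive when only
reference counts are used and disappear once the tracing collector has
run.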

3) team support

You should think /really/ carefully about this issue, since it
encompasses pretty much all other issues I'll mention. The problem
is that some systems are geared toward supporting teams of programmers,
while some are almost unusable if more than one person does the
coding. Team support means that you have to think about it from
the very start. Here is an example from Cold: all datatypes are
immutable. In effect, if you have a list, the only way to change
it is to construct a new list (the driver will recognize the
cases when this can be short-circuited without copying the data
around). This is an efficiency problem, but it's terribly important
because with this, you can freely return a list that is also
stored on an object's property, without fear that the method you
return the list to will overwrite an element of this list.
Java, in particular, doesn't have this feature, meaning that you
have to /think/ about copying all the data you don't want to
/ever/ see modified. Then there is the problem of dynamism: If you
never ever have to call the external compiler, then the team of
programmers can modify the system through their telnet connections.
VERY few mainstream languages can support this. I'll touch on some of
the points here in more detail...
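
The immutability point can be sketched in Python by contrasting a
mutable list with an immutable tuple. This is only an analogy to what
Cold does, not its actual implementation, and the class and function
names are invented for the example:

```python
class MutableRoom:
    def __init__(self, exits):
        self._exits = list(exits)

    def exits(self):
        return self._exits          # caller gets the stored list itself


class ImmutableRoom:
    def __init__(self, exits):
        self._exits = tuple(exits)

    def exits(self):
        return self._exits          # safe to share: cannot change in place

    def add_exit(self, name):
        # "Changing" the value means building a new one and rebinding.
        self._exits = self._exits + (name,)


def careless_caller(exits):
    """Try to overwrite the first element; report whether it worked."""
    try:
        exits[0] = "void"
        return True
    except TypeError:
        return False
```

Passing MutableRoom's exits to careless_caller silently corrupts the
room, so its author would have had to remember to return a copy. With
ImmutableRoom the same call fails and the stored value is untouched.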

4) static vs dynamic typing

This is a sticky issue. If you decide to go with team support,
then you're better off with dynamic typing. In my experience,
it's MUCH easier to get used to (even if it has its own pitfalls,
such as the problem that you'll only know that types are wrong
at runtime). Also, you need VERY advanced static typing if you
ever hope to compete with dynamic (again, look at how ML does it).
Finally, dynamic typing makes automatic memory management easier,
as data must be tagged so that the types are immediately deducible
at runtime - this means you can actually write a good garbage
collector.

Still, I would agree that this point depends on personal taste.
If you will be the only coder, choose what suits you best.

5) persistence

Some MUDs have persistence implemented on the language level.
This means that some data (world objects, most usually) automatically
survive crashes/reboots/shutdowns. This, if you can afford its price,
is extremely useful. The price is database management, which can
become VERY complex if you want to make it sufficiently general.
In addition to that, you will also have to give thought to
backups, checkpointing, integrity, and many other things - basically,
you'll have to read a few VERY thick books on the topic.

The gains are tremendous: you will be able to modify the database
in a permanent way on the object level, without even soiling
your hands with I/O. If you have a team of programmers, this is
even more important.
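
A very rough feel for "modifying the database without explicit I/O" can
be had from Python's standard `shelve` module. This is a sketch under
loud assumptions: `shelve` is a simple key/value store, not the
transparent, crash-safe object persistence described above, and the
function names and the "hall" key are invented for the example:

```python
import os
import shelve
import tempfile


def bump_visits(db_path, room):
    # Opening the shelf is the only storage-related step; reads and
    # writes then look like ordinary dictionary operations.
    with shelve.open(db_path) as world:
        count = world.get(room, 0) + 1
        world[room] = count
        return count


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "world")
        first = bump_visits(path, "hall")    # fresh database
        second = bump_visits(path, "hall")   # reopened: value survived
        return first, second
```

Each call to bump_visits opens and closes the store, standing in for a
server restart. Checkpointing, backups and integrity after a crash are
exactly the parts this sketch leaves out.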

6) interpretation/dynamic compilation/static compilation.

Interpreters are relatively easy to write, but can be slow. Static compilation
means that you parse the language offline (while the server is shut down)
and turn it into object files (possibly by using, for example, C or C++
as an intermediate step), then link and reboot. Dynamic compilation is
VERY hard to do right, and means that code is compiled at runtime and
dynamically linked. As I said above, I'm looking at the possibility of
dynamic compilation without linking and without disk access - again,
look at VCODE.

In my experience, interpreters are sufficient for almost anything, if
you write them well.
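
For a taste of compiling code while the server keeps running, here is a
Python sketch using the built-in compile() and exec(). Note the
assumption: this produces interpreter bytecode, not machine code as a
VCODE-style generator would, and the `verbs` table and `compile_verb`
helper are hypothetical:

```python
def compile_verb(name, source):
    """Compile source text at runtime and return the function it defines."""
    namespace = {}
    code = compile(source, "<verb %s>" % name, "exec")
    exec(code, namespace)
    return namespace[name]


SCORE_V1 = "def score(points):\n    return 'Score: ' + str(points)\n"
SCORE_V2 = "def score(points):\n    return 'You have %d points.' % points\n"

# Command table of a running server; entries can be swapped at any time.
verbs = {}
verbs["score"] = compile_verb("score", SCORE_V1)
```

Replacing the entry with compile_verb("score", SCORE_V2) changes the
command's behaviour without a restart, a link step, or any disk access.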

7) frozen world versus softcoded world

This issue deals with the following decision: You have to choose
whether you'll design your core objects externally to the admin-visible
systems, or show the internals to all the coders. The former is,
without doubt, more efficient, and you can do certain things that'd
be forbiddingly slow in the latter model. However, I suspect that
with dynamic compilation in place, this issue becomes meaningless.

8) security

In a team approach, programmers' code should run in a sandbox - it
wouldn't do if each programmer could crash the entire system just
by debugging a new version of the 'score' command. :) This encompasses
a lot of things: object access, database access (if you don't go
for automatic persistence), datatype opaqueness, and a large number
of other issues. Most mainstream languages don't even begin to
deal with this issue. So, IMHO, a mainstream language is not an
option for a MUD done by more than a single programmer.

9) dynamic objects/inheritance

This issue is VERY problematic. The idea is that if you go for
a dynamic object system, the gain is that you're able to dynamically
reparent objects, change the methods and properties on the fly,
and do all of this through your telnet connection. The downside
is that this slows down message passing, both in interpreters and
compilers. It can still be made /sufficiently/ efficient, actually.
Common LISP has full dynamic inheritance supported by its native
object system, and it's not too slow. The secret is caching - if
you don't /repeat/ method lookups when they're not needed, you'll
be well off.
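
The caching idea can be sketched as follows in Python. This is a toy
model with made-up names (`Obj`, `Dispatcher`), single inheritance, and
the crudest possible invalidation (flush everything on any change). It
is not how Common LISP or any particular MUD driver does it:

```python
class Obj:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.methods = {}


class Dispatcher:
    def __init__(self):
        self.cache = {}          # (object, selector) -> function
        self.slow_lookups = 0    # how often the parent chain was walked

    def find(self, obj, selector):
        key = (obj, selector)
        if key in self.cache:
            return self.cache[key]          # fast path: no walk needed
        self.slow_lookups += 1
        node = obj
        while node is not None:             # slow path: walk the parents
            if selector in node.methods:
                self.cache[key] = node.methods[selector]
                return node.methods[selector]
            node = node.parent
        raise AttributeError(selector)

    def send(self, obj, selector, *args):
        return self.find(obj, selector)(obj, *args)

    def define(self, obj, selector, func):
        obj.methods[selector] = func
        self.cache.clear()                  # any change invalidates the cache

    def reparent(self, obj, new_parent):
        obj.parent = new_parent
        self.cache.clear()
```

Repeated sends of the same message hit the cache, and only a reparent
or a method change forces the chain to be walked again. A real system
would invalidate far more selectively than this.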

I think that full dynamism is important if you're working on a very
creative setting with lots of custom code, and you want this code
done QUICKLY - the programming speed in dynamic languages is so
much greater than in static object systems that it's not even fun.
It also gives great freedom in experimenting.

--------------------------

Among the mainstream languages, Common LISP satisfies most of these
requirements. Scheme comes close, in some implementations (as the Scheme
standard is pretty minimal). Neither can very well support a team
of telnetted programmers developing the world, though. Some MUDs
discussed here do.

> Maybe I used incorrect terminology. 
> 
> 5  [ PROBLEM-ORIENTATED LANGUAGE LEVEL ]
> 4  [      ASSEMBLY LANGUAGE LEVEL      ]           
> 3  [  OPERATING SYSTEM MACHINE LEVEL   ]
> 2  [    CONVENTIONAL MACHINE LEVEL     ]
> 1  [      MICROPROGRAMMING LEVEL       ]
> 0  [       DIGITAL LOGIC LEVEL         ]
> 
> These are the 6 layers to which I was referring. To move from one level 
> to another, compilation/interpretation is required. Adding another level 
> on top is what I *think* something like LPC does. Maybe, maybe not.
> 
> Is anyone aware of any sort of comparison between languages, showing 
> levels of suitability for various tasks?

The problem is that MUD design means that you've already decided on the
particular task. Out of a LARGE number of general-purpose languages, I
have yet to find *one* that supports persistence, a dynamic object system
and automatic memory management in a satisfying manner. RScheme (the
implementation) comes close, but it still has a long way to go.

	Miro