[MUD-Dev] Re: Databases: was Re: skill system

Tue Jun 30 02:57:01 CEST 1998

On 25 Jun 98, J C Lawrence wrote:
> On Thu, 25 Jun 1998,  Jon A Lambert wrote:
> 
> > The method I like best is the direct class to table approach since
> > it preserves the spirit of OO.  However, the ease and speed of
> > access is directly related to the depth of your inheritence tree. :(
> 
> Not knowing how the performance metrics of databases are related to
> table count (what happens when you start having thousands or tens of
> thousands of tables?),

The sheer numbers of tables usually isn't as important as the number 
of table joins one must navigate to retrieve an object instance.  
In any non-trivial object diagram there will usually be multiple 
paths to access an object instance.  Ideally or unfortunately this 
goes back to your design.  For example, the glowing torch in Bubba's 
hand might be accessed by starting at object Bubba and traversing 
Bubba's inventory list; or perhaps in some contexts, a short circuit 
is faster like starting at object RoomX and accessing LightSource.
There are usually hard limitations on the number of table joins 
(very vendor dependent).  Although I cannot fathom a mud having 
1000+ tables.  As a matter of fact, I haven't worked with a single 
application, IRL, that had more than a couple hundred tables.  
Although I've been in shops that have 1000's of tables, but this was 
for their entire suite of applications.

>  as well as table size (what impact does the row
> count have on performance, especially when compared to the percentage
> of "interesting" rows?  (eg which is better: a single 10,000 row table
> with 20 interesting rows, or a hundred 100 row tables each of which
> has small chance of having anything interesting?))

Generally one attempts avoids to avoid tablescans.  Tablescans are
where the DB is required to examine every row of a table in order to
provide a solution set.  Having all tables keyed (indexed) with OID
prevents tablescans with associations.  The size of the tables makes
very little difference.  Index leaves are very much like nested hash 
tables, so access is relatively constant.  Good RDBMS's have 
utilities to rebuild "intelligent" index leaving as tables grow and 
shrink.   Some even measure key cardinality and use that as input 
into rebuilding the index leaves.  Using OID of course makes 
cardinality useless.  

You may be talking about what I call adhoc querying or "what if" 
queries.  Like, How many steel broadswords are there in the 
world?

   :select count(*) from weapons where weapon.type = 'broadsword' and
        weapon.material = 'steel'

This would require an entire traversal of the weapon table.  
(Note:  This assumes you've implemented weapon as a distinct 
descendent of something else and you know to start there!)

Now if you define an index on column material in table weapon,  a 
much smaller  subset of the weapon table is scanned.  
(Note again:  Some rdms's can be very smart in determining the 
optimal order of evaluation with the AND operation, others depend on 
strict left to right evaluation)  

The differences in time can be dramatic.  OTOH, how frequent is this 
query?  If it's quite frequent in softcode (economic simulation 
perhaps?), then the index makes sense.  In addition, the more indexes 
on a table , the time to insert a row is increased.   The solution is 
measurement and tuning, as new features are added into the server.

> I've been playing
> with abstracting the internal logical relationships from the DB
> entirely, and almost using the DB only for access and storage, and
> leaving the logical interpretion of that storage to the internal
> language.

Exactly.  A translation layer. 

> Don't know if its a useful approach (remember, I've been out of the DB
> world for nigh on 10 years now). but the basic idea would be to have
> one table which held only ObjectID's (a system-wide primary key) and a
> deleted/live status for that object).  Another table would hold tuples
> of ObjectID's and method definitions (soft code and byte code).
> Potentially each method definition would be its own table with one row
> per referencing object, and one collumn per internal state variable
> (and two for the code representations).  Another table would hold
> tuples of parent and child inheritance ObjectID's.  Etc etc etc.  All
> very very simplistic.  An entire object definition (which would
> enclude its state) would be the product of the orws from a very large
> number of tables.  The actual intelligence of the DB in regard to
> processing object characteristics doesn't get used at all -- that's
> all abstracted into the internal language which maintains the sense of
> the structure in its own logical constructions.
>

Hmmm...  First and second thoughts.

Construct an abstract object model of the object model that the 
softcode follows.  Even if you not using relational technology, I 
think this is the way to go.  How many OO-DBMS's or P-Store systems 
follow your softcode OO model?   Probably none, right!?  

Perhaps we should be discussing softcode OO models rather than
DB implementations.  Is the softcode-OO-model like Java, C++, 
ColdC, LPC, etc.?  Single or Multi inheritence?  Is there a 
distintion between object and class?  Are inheritence and attributes 
dynamic?  Are there built-in or native objects?  

> I'm not even going to comment on search/access overheads for the above
> as I haven't even tried it yet.
> 
> This of courses raises a fundamental design point: To what extent
> should the DB be a reflection of, or should support and or directly
> represent the logical construction of the soft code language?

I think directly and as part of the softcode language.
There are a couple analogies I can make, that may or may not 
fire a neuron.

a) softcode accesses objects like one would use java reflection.    
b) softcode has to builtin objects like javascript.  That is 
imagine the translation layer == browser.
c) softcode has builtin collections like VB has DAO, Controls, etc.  

>  A
> persistent store is the ultimate in tieing the DB side to the logical
> structure yada yada of the language.  The standard MUSH/Tiny-*/Cold
> dbm approach is far out at the other end of the scale in making the DB
> represenation opaque in its relation to the language internal
> structure.
> 
> I realise that the above is heavily counter to the "heavy DB" model of
> the universe.  Still wondering why I should care however.

If one takes a close look at commercial (and freeware) OO-DBMS's, you 
will notice that in many cases the package includes an access 
language that reflects the particular architecture, perceptions and 
interpretations of what the designers' think is OO.

What you (and I) are doing is really designing the architecture 
of an OO-DBMS. 

> Note: Metrics on MySQL performance can be found at:
> 
>     URL:http://www.mysql.com/benchmark.html
>

Neato.  What's this Oraxle 1.0 thing?  Upon first (mis)reading my jaw 
dropped thinking MySql beat the crap out of Oracle!?!

--
--/*\ Jon A. Lambert - TychoMUD     Internet:jlsysinc at ix.netcom.com /*\--
--/*\ Mud Server Developer's Page <http://www.netcom.com/~jlsysinc> /*\--
--/*\   "Everything that deceives may be said to enchant" - Plato   /*\--