[MUD-Dev] TECH: DBM vs BDB speeds (was: AmigaMud DB questions)
Bruce
bruce at puremagic.com
Sat Apr 28 13:33:15 CEST 2001
Jon Lambert wrote:
> Has anyone asked ....
>
> "But why don't you use GDBM 1.8.0 for all of your storage?"
>
> I think it performs much better than BDB.
>
> I've noticed that both of these seem to operate best (for me at least)
> at a pagesize of 4K, which I don't think is a coincidence since it
> happens to match my OS page size.
We did try to improve GDBM's performance at one point, but I honestly can't
recall now which version we used or which knobs we tried to tune.
Our problematic case, though, is a particularly nasty one. It isn't our
usual runtime workload but rather the compilation of a DB from the text
form into the binary DB. With a large DB (about 1.5G) consisting of
roughly 1.1 million objects, DB compilation was taking between 6 and 13
hours. Profiling showed that this was due to the speed of record
insertion into our DBM-based index. In fact, if we deferred all DBM
insertions until after the actual compilation had finished, by keeping a
big hash of the records in memory, we could get through the compilation
itself in roughly 30-40 minutes, followed by several hours of waiting
while the DBM index was populated.
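
To make that concrete, the experiment looked roughly like the sketch
below: during compilation the index writes go into an in-memory table,
and the on-disk NDBM index is only populated in a single pass afterwards.
This is only an illustration of the idea, not Cold's actual code; the
names (index_entry, index_remember, and so on) are made up, and a real
version would also consult the in-memory table for lookups during
compilation.

#include <ndbm.h>
#include <stdlib.h>

typedef struct index_entry {
    long objnum;
    long seekaddr;
    struct index_entry *next;
} index_entry;

#define NBUCKETS (1 << 20)              /* ~1M buckets for ~1.1M objects */
static index_entry *buckets[NBUCKETS];

/* Called during compilation in place of dbm_store(). */
static void index_remember(long objnum, long seekaddr)
{
    unsigned long h = (unsigned long) objnum % NBUCKETS;
    index_entry *e = malloc(sizeof *e);
    e->objnum = objnum;
    e->seekaddr = seekaddr;
    e->next = buckets[h];
    buckets[h] = e;
}

/* Called once, after the text-to-binary compilation has finished. */
static void index_flush(DBM *db)
{
    for (unsigned long h = 0; h < NBUCKETS; h++) {
        for (index_entry *e = buckets[h]; e != NULL; e = e->next) {
            datum key, value;
            key.dptr = (char *) &e->objnum;
            key.dsize = sizeof e->objnum;
            value.dptr = (char *) &e->seekaddr;
            value.dsize = sizeof e->seekaddr;
            dbm_store(db, key, value, DBM_REPLACE);
        }
    }
}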
I wrote a small benchmarking program that just created a given number of
keys, and started looking at tuning the performance of DBM, BDB's DBM
emulation layer, and BDB used directly. I had no luck with the DBM or
BDB-DBM approaches, but using BDB directly and setting a very large
cache size (about 145M), I could insert 2.2M records in about 2 minutes.
Shrinking that cache increased the insertion time non-linearly. Our
working assumption became that we could (and should) simply use
different cache sizes for compilation and for normal runtime usage,
which has been working fairly well on a test server of mine. That
server runs Solaris, so the comparisons against DBM for runtime
operations have been against the Solaris DBM rather than either the
FreeBSD NDBM or GDBM. (My original benchmarking was on FreeBSD, though,
since that's what the main server I care about runs at the moment.)
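
For what it's worth, the benchmark was essentially the following shape:
open a Berkeley DB btree with a big cache and time a bulk insert. This
is a from-memory sketch against the 3.x-era C API (DB->open without a
transaction argument), not the actual benchmark program; the file name
and record layout are invented.

#include <db.h>
#include <string.h>
#include <stdio.h>

int main(void)
{
    DB *dbp;
    int ret;

    if ((ret = db_create(&dbp, NULL, 0)) != 0) {
        fprintf(stderr, "db_create: %s\n", db_strerror(ret));
        return 1;
    }

    /* ~145MB cache; with less cache, insert time grew non-linearly. */
    dbp->set_cachesize(dbp, 0, 145 * 1024 * 1024, 1);
    dbp->set_pagesize(dbp, 4096);       /* match the OS page size */

    if ((ret = dbp->open(dbp, "index.db", NULL, DB_BTREE,
                         DB_CREATE, 0664)) != 0) {
        dbp->err(dbp, ret, "open");
        return 1;
    }

    for (long i = 0; i < 2200000; i++) {
        long seekaddr = i * 512;        /* dummy payload */
        DBT key, data;
        memset(&key, 0, sizeof key);
        memset(&data, 0, sizeof data);
        key.data = &i;          key.size = sizeof i;
        data.data = &seekaddr;  data.size = sizeof seekaddr;
        dbp->put(dbp, NULL, &key, &data, 0);
    }

    dbp->close(dbp, 0);
    return 0;
}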
I would love to hear that I'm wrong or that I missed something and that
I can get equivalent levels of performance out of GDBM. :) It would make
my life much easier not to have to introduce a dependency on BDB 3.2.9
for Cold, or, failing that, not to have to maintain two versions of our
lookup index code.
> There are problems with large page sizes in NDBM and very small
> key/value pairs. While a large page size may minimize writes at
> database creation time, too many objects on a page makes for longer
> search times. Then again the value size limits make NDBM pretty much
> unusable anyways as a complete solution, which is why the whole
> bitmap/block file is needed.
>
> Also doesn't Cold actually store 2 database records per object,
> symbol->objnum and objnum->seekaddr? Then requires yet another seek
> and write to that bitmapped/block file.
Cold does store 2 records per object. But the first, symbol->objnum, is
cached within the driver, which avoids most of the common hits to that
DB. Currently, objnum->seekaddr isn't cached (I'm not sure why; this is
old code), but I've recently introduced some code to avoid re-writing
that value when it isn't necessary to do so. (That change has no impact
on the DB compilation phase, which has been our main worry.) Currently,
our disk I/O patterns only involve the objnum->seekaddr index and the
block file. With BDB and an appropriate cache size, we should be
avoiding most of the index updating.
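
The "avoid re-writing" change amounts to something like the sketch
below: read the current entry first and only issue the write when the
stored seek address actually differs. The function and names here are
illustrative, not Cold's code, and it assumes the BDB C API
(DB->get/DB->put) with a handle opened elsewhere.

#include <db.h>
#include <string.h>

/* Only write objnum->seekaddr when the stored value differs. */
static int index_update_seekaddr(DB *dbp, long objnum, long seekaddr)
{
    DBT key, data;
    long current;
    int ret;

    memset(&key, 0, sizeof key);
    memset(&data, 0, sizeof data);
    key.data = &objnum;
    key.size = sizeof objnum;
    data.data = &current;
    data.ulen = sizeof current;
    data.flags = DB_DBT_USERMEM;        /* read into our own buffer */

    ret = dbp->get(dbp, NULL, &key, &data, 0);
    if (ret == 0 && data.size == sizeof current && current == seekaddr)
        return 0;                       /* unchanged: skip the write */

    memset(&data, 0, sizeof data);
    data.data = &seekaddr;
    data.size = sizeof seekaddr;
    return dbp->put(dbp, NULL, &key, &data, 0);
}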
There were some stupid things happening in that code within Cold as
well, like serializing integral values to strings, and using a single
DBM file to store both indices, with associated hackery to determine
which index a particular key/value pair belonged to. I removed all of
that and found a way to remove some strlen() calls, which shaved about
30-45 seconds off of the 40-minute compile time and made the code much
easier to understand conceptually.
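
As a hypothetical before/after of the key handling (not the actual Cold
code): the old path built tagged string keys in a shared file, while the
new path stores the raw integer bytes in a per-index database.

#include <db.h>
#include <stdio.h>
#include <string.h>

/* Old style: integral key serialized to a string, with a one-character
 * prefix to say which of the two indices it belongs to. */
static void make_string_key(DBT *key, char *buf, size_t buflen, long objnum)
{
    snprintf(buf, buflen, "O:%ld", objnum); /* "O:" tags the objnum index */
    memset(key, 0, sizeof *key);
    key->data = buf;
    key->size = strlen(buf) + 1;            /* extra strlen per record */
}

/* New style: the raw integer is the key; each index gets its own DB. */
static void make_binary_key(DBT *key, long *objnum)
{
    memset(key, 0, sizeof *key);
    key->data = objnum;
    key->size = sizeof *objnum;
}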
> "I hadn't even tested storing large or randomly sized objects in
> there."
>
> In your test then, you are still saddled with the I/O overhead of
> maintaining the "objects" file that your objnum->seekaddr provides the
> pointer into. So doesn't Cold actually do 3 times more I/O than might
> be necessary? Or am I missing something there?
See above. It was strictly a test of the speed of DBM operations when
there are many, many records.
Since Cold requires a decompile/recompile of the binary database when
upgrading the driver, this was very important work: it dropped the
required maintenance window for upgrading a large game from somewhere
over 15 hours to under 3, possibly under 2. (For a typical small game,
you'd be seeing decompile/recompile times well under 10 minutes with
both the old and the new lookup index code.) This work also got us back
to being CPU-bound during DB compilation.
- Bruce