[MUD-Dev] Room-based vs. coordinate-based

Thu Jun 12 21:35:45 CEST 1997

On 12/06/97 at 07:55 PM, Jeff Kesselman <jeffk at tenetwork.com> said:
>At 09:24 AM 6/12/97 PST8PDT, you wrote:

Attributions!

>>I'm working on R-tree based rooms (but I suspect I'll have to move my
>>current R-tree implementation into the driver, since it's both complex
>>and critical).

>Interesting. Care to define an R-Tree, miro? The data structure for
>some reason is pulling up a blank for me.

I posted the following here a whiles back:

- --<cut>--
What is R-Tree

A R-Tree, proposed by Antonin Guttman[1], is an index structure for
point and spatial data at the same time. Insert, delete and search can
be intermixed without periodic reorganization. It uses a tuple to
represent a spatial data in the database. In order to retrieve the
data, each tuple has a unique identifier, tuple-identifier. At the
leaf node of a R-Tree, it has index record that can reference the
spatial data. The index record is (I, tuple-identifier). I is an
n-dimensional rectangle and it is the bounding rectangle of the
spatial data indexed. This rectangle is also known as minimal bounding
rectangle, MBR. and each entry in tuple-identifier is the upper and
lower bounds, [upper, lower], of the rectangle along the dimension.
Non-leaf nodes contain entries (I, childnode-pointer) where I is the
minimal rectangle bounding all the rectangles in the lower nodes'
entries. Childnode-pointer is the pointer to a lower node in the
R-Tree. Let M and m<=M/2 be the maximum and minimum number of entries
can be filled into one node respectively.

Properties of R-Tree

A R-Tree satisfies the following properties:

    A R-Tree is a height balance tree and all leaves are on the same
    level.

    Root node has at least two children unless it is the leaf node.

    Every non-leaf node contains between m and M entries unless it is
    the root.

    For each entries (I, childnode-pointer) in a non-leaf node, I is
    the smallest rectangle that spatially contains all rectangles in
    its child nodes.

    Every leaf node contains between m and M index records unless it
    is the root.

    For each index record (I, tuple-identifier) in a leaf node, I is
    the smallest rectangle that spatially contains the n-dimensional
    data object represented by the indicated tuple.

R*-Tree

The R-tree is based on a heuristic optimization. The optimization
criterion is to minimize the area of each enclosing rectangle in the
inner nodes. R*-Tree which incorporates a combined optimization of
area, margin and overlap of each bounding rectangle in the inner nodes
was proposed in [6]. For slightly higher implementation cost, it
outperforms the existing R-Tree variants.

    Minimizing the area covered by a bounding rectangle should
    minimize the dead space. This will improve performance since
    decisions which paths have to be traversed, can be taken on higher
    level

    Minimizing the overlap between bounding rectangles decreases the
    number of paths to be traversed.

    Minimizing the margin of a bounding rectangle will make the
    rectangle more quadratic. It is because for fixed area, the object
    with the smallest margin is the square. Quadratic rectangles can
    be packed easily and thus building a smaller rectangle.

VP-Tree

Conventional spatial index structures divide the multi-dimensional
vector space into partitions which have approximately the same number
of data points as each other. It facilitates in finding the nearest
neighbor of a given query point because it is only necessary to touch
a small number of partitions. Most partitioning methods are based on
absolute coordinate values of the vector space. R-Tree and R*-Tree
described before use this type of partitioning method. The structures
partitioned in this way are useful for queries based on absolute
coordinates, like range queries. However, in general, it does not
maintain any distance information, such as distance between points
within a partition and the partition's boundaries. Since this
information is critical in pruning the search space for
nearest-neighbor search, index structures using partitioning methods
based on absolute coordinate are thus not so useful for
multi-dimensional nearest-neighbor search.

Nearest-neighbor search by definition is to find out one point with
minimum point-to-point distance from a given query point, so it is
natural to use partitioning method based on relative distance rather
than absolute coordinate values. Vantage-Point tree, or VP-Tree,
method was proposed by Peter N.Yianilos. It uses the partitioning
method based on relative distance and aims for handling
multi-dimensional nearest neighbor search.

As mentioned before, VP-Tree method bases the partitioning on the
relative distances among the data points, rather than their absolute
coordinate values. It also bases on a particular vantage point.
Actually, vantage point is nothing special but a point selected from a
vector space, or a set of data points. However, the choice of vantage
point plays an important role in the performance of indexing
algorithm.
--<cut>--

--
J C Lawrence                               Internet: claw at null.net
----------(*)                              Internet: coder at ibm.net
...Honourary Member of Clan McFud -- Teamer's Avenging Monolith...