I am currently working on kind of multi-graph structure. The thing is that two graphs are needed for my application which span the same vertices but with two distinct set of edges (i.e., a multi-graph). The first graph is a binary search tree, while the second graph must be general (directed / undirected, or bidirectional, possibly with cycles, etc.). The B-tree is used for fast queries of nearest nodes, while the second graph is needed for path-planning (e.g., in the style of shortest-path algorithms). Both graphs have to be completely mutable (reconnections, moving nodes around, adding/removing nodes and edges to either graphs). So far it's simple enough, just a multi-graph. A typical implementation of such a multi-graph (and similarly, a multi-index) would have all the nodes in one array (or list) and the two edge-sets as adjacency lists within the nodes, or, it would have two graphs that indirectly point to nodes in an array. This is simple to deal with because the nodes remain in fixed locations (or indices) in the array or list, and edges are easy to rewire and keep consistent with the node list.

However, for efficiency reasons (i.e., to solve locality of reference issues), I want to use specific memory layouts that are suitable for the binary search tree. In particular, I have an implementation of a breadth-first layout for the tree (i.e., BF-tree), as well as a cache-oblivious B-tree (or COB-tree) which have both proved to be orders of magnitude more efficient than a general adjacency list implementation (or linked-list style implementations). Also, the work to be done on the overlaid graph (the general adjacency-list graph) is expected to be local with respect to the partition of B-tree nodes, giving me even more locality of reference performance benefits. So, it is natural to use the B-tree as primary storage with adjacency-lists at the nodes to represent the overlaid graph. Now, of course, this becomes complicated when you allow both graphs to be mutable because nodes have to physically move around when restructuring (or rewiring) the B-tree (by self-balancing algorithms), and then the overlaid adjacency-list graph has to be kept consistent with the movements of the nodes. I anticipate quite a bit of trouble getting this to work correctly (let alone provably so by formal methods).

So, here's the question: I'm sure I'm not the first guy to get this idea and that it has been done before, but I have no idea what this is called in CS literature, does anyone know? Has anyone seen algorithms (or libraries) that implement this kind of pattern?

BTW, I use C++ and I use the Boost.Graph Library as the basis for my additional graph algorithms and data structures.

All 12 Replies

Generally speaking, having different linking structures overlaid on the same information is a very normal thing. A common example is when you want a one-to-one bidirectional dictionary, where you can lookup values by key or keys by value.

The appropriate solution is situational. If you have, for example, things concurrently iterating through a tree, your data structure would have certain limitations than they would otherwise lack.

There are many ways to link stuff together. To get the language straight, let's say you have values of type `V`, with `x.key()` being the key (of type `K`) by which the value `x` (which is of type `V`) gets compared.

Naturally one way to implement your tree and graph is to have a balanced BST consisting of BST nodes of type `BSTNode` each containing a value of type `V`, and an adjacency list which you might say is a `List<BSTNode *>`. (I'm just repeating this so that if this is inconsistent with what you've said, it's now obvious.) The BST is ordered by the comparison `x.key() < y.key()`.

So obviously, to improve locality of lookup, you put things in a B-tree. (Might I ask: How many keys are in your B-tree nodes? If your b-tree nodes are wider than an L3 cache line, are there really locality advantages? (In prefetching, perhaps?) If you're hitting swap, buy more RAM. If it's 1 TB, buy some SSD's.)

So now you have a b-tree, or generally some kind of lookup structure that has V values packed in nodes, be it all the nodes or just leaf nodes, and these values need to have a graph over top. So the physical location of nodes may need to move.

(1) One option is to simply, for each value, have an adjacency list and a list of back-pointers. If `x`'s adjacency list contains the memory address of `y`, then `y`'s back-pointer list contains the memory address of `x`. (So `x` and `y` would each be `struct VStruct { V value; List<VStruct *> adjacency_list; List<VStruct *> back_references; };`. Naturally we can argue whether very small lists should be packed inline, and how they should be done so. Maybe there isn't a particular VStruct type, everything is in some flattened and packed region of a big char array in the b-tree node, so that small adjacency lists and back-reference lists don't require accessing memory outside the node. And maybe they turn into BSTs when they go off-node, or when they get sufficiently large. Or maybe not.) (Also, you make your b-tree node structure that contains these things work in a way such that nodes rarely move. There is some kind of garbage collection scheme going on, or maybe they're just too small for that, or you are just generally greedy. Anyway there should be no more than O(1) amortized moves of nodes when something gets inserted into a b-tree node.)

Now that's one option for how to do things. You've got the graph structure hooked up to the nodes themselves, there. When you move a node, you update things appropriately. (Naturally, you make your b-tree node structure such that nodes are not moved every time.) The problem might be that these nodes are so big, no matter how well you try to pack and compress them, that you get a poor branching factor. Locality of btree lookups might be improved if adjacency lists weren't inline.

(2) Take option (1) and modify it so that every b-tree node doesn't contain adjacency lists. Instead it contains a pointer to a partner node that contains the adjacency/back-pointer lists. Each value in the b-tree node has a pointer (or just a 16-bit offset) to the location of its adjacency list in the partner node, and likewise in the reverse direction. Represent the adjacency lists somehow so that all the information is stored flat in the partner node and relatively local.

The advantage of this scheme is that pure b-tree operations are faster. Also, pure graph operations are faster, you still get the locality benefits of nearby keys and there are more local keys.

(3) Back-references are annoying, no? Instead of back-references, store deletion entries in your node that forward to other locations. For example, when you need to split a b-tree node, replace the removed values with deletion entries pointing to their new locations. Then, whenever you follow a pointer to a location, check if it's a deletion entry, in which case update your pointer to point to the new location (and perhaps decrease a reference count on the deletion entry, and increase a reference count on the new location, if that in any way is helpful). Every once in a while do a garbage collection pass on the whole graph, if the global count of deletion entries gets too high. Or do it incrementally -- just gradually visit all the nodes of the graph and update any value pointers that point at deletion entries. (Also try local passes if the deletion entry count of a given node gets too high? Depends on the distribution of the tail of non-local graph entries.)

Deletion entries are especially useful if they're much smaller than a value and its corresponding adjacency list.

maybe Also represent pointers smartly. Allocate your b-tree nodes inside a larger breadth-first tree of 32-k aligned blocks (which further fits within a breadth-first tree of 1GB aligned blocks, perhaps). Then represent pointers within the same 32-k block with 16-bit values whose LSB is 0, represent pointers within the same 1GB block with 32-bit values whose least two bits are 00, represent pointers outside that block with 64-bit values whose least two bits are 01. Right-shift the pointers by 1, 2, and 2, respectively, and then bitwise or them into the right page, before using them, of course. Maybe also have 8-bit references to the current node, if that's convenient. (And then 16-k blocks, 512MB blocks, since you have fewer bits.)

Thank you very much for the detailed answer! That's really helpful! Very much appreciated!

Let me layout a few terms that I'll use here to describe my setup:

B-tree: a breadth-first memory layout to store the nodes, their associated data, and the data associated to the edges of the tree (the details of this implementation are not so relevant). This component exposes a standard tree interface (set of functions like add-child, get-parent, etc.).
BST logic: an implementation of a binary search tree (in the very general sense). In actuality, this is a space partitioning tree (not necessarily binary either), so the logic is a bit more complicated than a simple red-black tree, for instance. Overall, the idea is that this code handles all the operations to look-up nodes, insert / delete, re-balance and so on. This code only deals with the standard tree interface (it is agnostic of the underlying data structure).
BST-data: data stored at the nodes or tree-edges that is used by the BST logic to do its work.
Adj-list graph: a general graph which has the exact same node-set as the B-tree, but allows for general inter-connections between them, i.e., an adjacency-list graph seems to be the only viable pattern to represent it. This component exposes a standard graph interfaces (those of the Boost.Graph Library).
Graph-algorithm: the algorithm (or set of algorithms) that use and require that general graph structure over the nodes. This code deals only with the standard graph interfaces.
Graph-data: data stored at the nodes or graph-edges that is used by the graph-algorithms.

So, my goal is to create a component which can use the B-tree as primary storage (i.e., for both BST-data and graph-data on the nodes) and expose two separate interfaces (or views), a tree interface and a graph interface, whose operations are linked together in such a way that everything is consistent all the time. The two benefits are that consistency work is encapsulated and that it will result in better locality of references.

So obviously, to improve locality of lookup, you put things in a B-tree. (Might I ask: How many keys are in your B-tree nodes? If your b-tree nodes are wider than an L3 cache line, are there really locality advantages? (In prefetching, perhaps?) If you're hitting swap, buy more RAM. If it's 1 TB, buy some SSD's.)

In my target application, the number of nodes can vary between tens of thousands to a million (it's a constructive method, and the number of nodes can vary quite a bit depending on many factors), and the node-data can be about 100-200 bytes or so. I haven't really looked at the actual sizes with respect to cache levels and so on, I mostly was led to using a B-tree from empirical testing. In my tests with the BST logic, I found that even though the average number of comparisons for look-ups was O(log(N)), the actual average running time grew linearly with the number of nodes. After ruling out everything else and optimizing the BST logic code, I ended up replacing the initial linked-list-style BST structure with a B-tree which resulted in a tremendous performance improvement giving the theoretically-expected log-time look-ups and amortized log-time insertions / deletions (that is, O(log(N) + k log(k)) where k is amortized to k << N). I also tried a cache-oblivious B-tree (or COB-tree) which also showed good performance (log-times) but with a higher constant factor, it is only in the much higher node-counts (hundreds of millions and up to a billion) that it made a difference (as expected, when swap-memory pre-fetching kicks in).

(1)

This option that you described is pretty much exactly how I intended to do it. Here are some comments:

One option is to simply, for each value, have an adjacency list and a list of back-pointers. If x's adjacency list contains the memory address of y, then y's back-pointer list contains the memory address of x. (So x and y would each be `struct VStruct { V value; List<VStruct *> adjacency_list; List<VStruct *> back_references; };`.

This is a given for me. I will have back-pointers (or in-edge lists) for sure. For one, I use Boost.Graph Library as a basis for interface specifications on the graph operations. In BGL, the graph concepts are pretty much set up to make it a requirement that in-edges for a given node can be obtained directly. This also makes sense in a graph-algorithmic sense, because most (or, at least, many) graph-algorithms need to be able to easily walk backwards in a graph.

Naturally we can argue whether very small lists should be packed inline, and how they should be done so.

Yeah, I thought about inlining the adj-lists in the nodes. It is a good idea, and would probably be beneficial w.r.t. memory access patterns. The problem is that it requires a limited fan-out (or branch-factor) on the graph, and I'm not sure I can easily deal with that in my graph algorithm (and I would have to run empirical tests to gather statistics on the average fan-outs generated by my probabilistic graph-algorithms to see how much of a waste it would be and about the effect of the limited fan-out). So, for now, I'll stick to a dynamic array (or list) for the adj-lists.

Also, you make your b-tree node structure that contains these things work in a way such that nodes rarely move. There is some kind of garbage collection scheme going on, or maybe they're just too small for that, or you are just generally greedy. Anyway there should be no more than O(1) amortized moves of nodes when something gets inserted into a b-tree node.

There are two separate things here, the B-tree and the BST logic. As I said already, the BST logic has insertions / deletions at O(log(N) + k log(k)) where k is amortized at much less than N, and k represents the actual number of nodes that need to move as a result of the insertion / deletion, i.e., for (local) re-balancing. Then, one must distinguish between actual relocation of nodes in memory and relocation of nodes with respect to the tree structure. Actual movements of nodes in memory (like copying them from one storage to a new (bigger or smaller) storage) is not an algorithmic issue at all because it is easy to make invariant references to the nodes that are unaffected by this physical relocation. And, of course, the B-tree storage is amortized constant with respect to needing to do such relocations (that's actually a feature inherent to B-tree layouts, in addition to storing it in an amortized-constant storage (e.g., `std::vector`)). The main problem is dealing with local re-structuring of tree branches and maintaining the adj-lists (and graph in general) consistent. Ideally, I would like to do this without being too intrusive in the B-tree (or not intrusive at all).

The problem might be that these nodes are so big, no matter how well you try to pack and compress them, that you get a poor branching factor. Locality of btree lookups might be improved if adjacency lists weren't inline.

That was pretty much my (tentative) conclusion too about not inlining the adj-lists. ("tentative", because only empirical evidence can support such a conclusion, of course)

(2)

Take option (1) and modify it so that every b-tree node doesn't contain adjacency lists. Instead it contains a pointer to a partner node that contains the adjacency/back-pointer lists.

This is actually a great idea! And it's so simple too! If I store the nodes in the B-tree as, for each node: a reference to the partner-node; the BST-edge-data lists; the BST-node-data; AND, the graph-node-data. Then, I'll get really good locality of reference for both BST look-ups and for accessing the graph-node-data. And I'll only suffer locality problems when I fetch the adj-list information for a node (which I'm gonna suffer anyways unless I inline the adj-lists). This is really cool because synchronizing the two graphs becomes much easier with this decoupling. As Andrew Koenig would say, you can solve any problem in additional indirection. But, in this case, even though the idea is to avoid indirection, the problems can be solved with a well-positioned indirection. I like this option very much. It will probably be my first try. Can't believe I didn't think of that before.

Represent the adjacency lists somehow so that all the information is stored flat in the partner node and relatively local.

That's interesting... Can you elaborate? I thought it would be pretty hard to flatten adjacency-lists, near impossible, unless it is limited fan-out and inlined. Also, how would I or could I go about to try and make the placement of the partner nodes relatively localized in a manner that reflects the localization of the B-tree nodes, without having to tie them too strongly to the B-tree layout (otherwise, it boils down to the same problem as having no partner nodes at all).

(3)

I'm not convinced there is a need for that in the adjacency-list, and even less so when considering I do need in-edge lists. As for the B-tree, well, it obviously doesn't need that.

It would certainly be useful to have a heap to manage holes inside the storage of the partner nodes, to avoid unless copying of nodes, when there are many deletions / insertions in random locations in the partner node list.

However, I will need a repository of deleted nodes to represent the B-tree nodes that are temporarily taken out during the re-balancing by the BST logic. Again, I'd like that to be the least intrusive possible on the BST logic code, so it would make sense to cache the deleted tree-nodes in order to revive them when the BST logic re-inserts them into the B-tree. Implementing this well might be the main challenge in this problem.

maybe Also represent pointers smartly. Allocate your b-tree nodes inside a larger breadth-first tree of 32-k aligned blocks (which further fits within a breadth-first tree of 1GB aligned blocks, perhaps). Then represent pointers within the same 32-k block with 16-bit values whose LSB is 0, represent pointers within the same 1GB block with 32-bit values whose least two bits are 00, represent pointers outside that block with 64-bit values whose least two bits are 01. Right-shift the pointers by 1, 2, and 2, respectively, and then bitwise or them into the right page, before using them, of course. Maybe also have 8-bit references to the current node, if that's convenient. (And then 16-k blocks, 512MB blocks, since you have fewer bits.)

What you seem to be describing is a cache-aware B-tree implementation, with some bit-trickery. From CS literature on the subject, it seems that cache-aware implementations are a waste of time because they are really hard to do correctly and the benefits are often not what you might hope for (in part due to the fact that memory-models used to design cache-aware data structures and algorithms can rarely match the complexity of real-world architectures). BTW, locality of reference issues and cache-optimization issues are two separate problems (but both are memory access pattern problems). It seems that cache-oblivious implementations are much better because they scale well to any kind of actual memory-architecture and are particularly popular in large-scale systems (large databases, clusters, super-computers, etc.). As for bit-trickery, in my experience, such tricks rarely pay off (and come with a number of annoying low-level issues), but maybe I'm just not good enough in nano-optimization (I'm barely getting a hang of micro-optimization).

Anyways, thanks again, you've given me quite a few good ideas and things to think about. I'll be glad to hear more thoughts on this.

I'm going to reply in several separate posts because it's hard to organize everything in one, and the points are basically unrelated.

What you seem to be describing is a cache-aware B-tree implementation, with some bit-trickery.

The purpose of that is not to be cache aware (since cache lines are neither 32k nor 1GB), it's to make pointer sizes smaller so that you can pack stuff more tightly in memory.

B-tree: a breadth-first memory layout to store the nodes, their associated data, and the data associated to the edges of the tree (the details of this implementation are not so relevant). This component exposes a standard tree interface (set of functions like add-child, get-parent, etc.).

BST logic: an implementation of a binary search tree (in the very general sense). In actuality, this is a space partitioning tree (not necessarily binary either), so the logic is a bit more complicated than a simple red-black tree, for instance. Overall, the idea is that this code handles all the operations to look-up nodes, insert / delete, re-balance and so on. This code only deals with the standard tree interface (it is agnostic of the underlying data structure).

It is quite important that you explain what this actually is. The ostensible BST is not even a binary tree. So how can you possibly call it a BST, then? I thought you were replacing your binary search tree with a B-tree. What is this "B-tree" then, if it's being used to hold what was once in some non-binary tree? And what are its keys by which it sorts values? Why is there both a B-tree and a space partitioning tree?

The notion of a space-partitioning tree is that a given node represents some subset of space, no? Is it important to you that this space-partitioning tree structure be precisely preserved? What do you actually want the space-partitioning tree for?

How wide are the b-tree nodes?

Yeah, I thought about inlining the adj-lists in the nodes. It is a good idea, and would probably be beneficial w.r.t. memory access patterns. The problem is that it requires a limited fan-out (or branch-factor) on the graph, and I'm not sure I can easily deal with that in my graph algorithm (and I would have to run empirical tests to gather statistics on the average fan-outs generated by my probabilistic graph-algorithms to see how much of a waste it would be and about the effect of the limited fan-out). So, for now, I'll stick to a dynamic array (or list) for the adj-lists.

The point is that small lists should be flattened, large ones will need to use another contiguous chunk of the heap.

That's interesting... Can you elaborate? I thought it would be pretty hard to flatten adjacency-lists, near impossible, unless it is limited fan-out and inlined. Also, how would I or could I go about to try and make the placement of the partner nodes relatively localized in a manner that reflects the localization of the B-tree nodes, without having to tie them too strongly to the B-tree layout (otherwise, it boils down to the same problem as having no partner nodes at all).

If you have a partner node containing representations of the tuples `(pointer-to-space-partitioning-thing, outward-adjacencies, inward-adjacencies)`, the representations themselves might look like this in memory:

``````uint16_t offset_of_space_partitioning_thing;
...
...
// now another one
uint16_t offset_of_another_space_partitioning_thing
``````

If `num_outward_adjacencies` is too big then it would look like this:

``````uint16_t offset_of_space_partitioning_thing;
...
``````

The void pointers point to the `offset_of_space_partitioning_thing` field of another graph entry. (Then you round down to the nearest multiple of 2^n memory address to find the beginning of the partner node, which contains the pointer to the b-tree node of which it is a partner, since the partner nodes were allocated with `posix_memalign`. Also, really the void pointers might actually point to an offsets list at the beginning of the partner node, so that you don't have to update all of them whenever you move the graph entry around.)

If you do that naturally you have to manage memory within the flat partner nodes.

That's something that naturally happens with disk-backed storage, whether it's appropriate here depends on the size of the partner and b-tree nodes.

(Are the b-tree nodes not flattened as well??? That is generally the point of b-trees...)

In my target application, the number of nodes can vary between tens of thousands to a million (it's a constructive method, and the number of nodes can vary quite a bit depending on many factors), and the node-data can be about 100-200 bytes or so.

It is quite important that you explain what this actually is.

The ostensible BST is not even a binary tree. So how can you possibly call it a BST, then?

By "not necessarily binary", I really should have said "d-ary". In other words, it's a d-ary search tree. If d = 2, then it is binary, otherwise it is not. This is simply because it doesn't matter, implementation-wise, whether d = 2 or not. The only point that matters is that it is a fixed fan-out tree.

I thought you were replacing your binary search tree with a B-tree.

Those two things are separate concepts. A binary search tree (like a red-black tree) denotes the logic (or algorithms) used to construct and maintain the tree structure. The B-tree is one way to store that tree. And, of course, it goes without saying that the concept of a B-tree works just as well for a d-ary tree. So, by B-tree, I really just mean a breadth-first layout for a fixed fan-out tree.

You can't "replace a binary search tree with a B-tree", you can choose to store the binary search tree in a contiguous array with a breadth-first layout (i.e., a B-tree for short). But there are alternative storage strategies, like a COB-tree, or even a linked-list-style tree (each nodes has pointers to child nodes, and so on), although the latter is going to have very poor performance. The point remains, the BST logic and the storage strategy are two independent entities. The proof is that in my implementation, the two are entirely separated (and I can easily swap out the kind of data structure used by the BST logic, in fact, it is a template argument of the class).

What is this "B-tree" then, if it's being used to hold what was once in some non-binary tree?

A D-ary breadth-first layout in a contiguous array. You can see the implementation here.

And what are its keys by which it sorts values?

The keys are position values associated to the nodes.

Why is there both a B-tree and a space partitioning tree?

The B-tree is the storage strategy. The space partitioning is the tree construction and maintenance strategy. They work in tandem, like in any other search tree implementation. I don't see anything unusual with that, it's just a modular design.

The notion of a space-partitioning tree is that a given node represents some subset of space, no?

Each node represents a partitioning of the space, so yes, each node represents a subset of the space.

Is it important to you that this space-partitioning tree structure be precisely preserved?

Yes, otherwise it is useless.

What do you actually want the space-partitioning tree for?

The space-partitioning tree that I have implemented is what is called a Dynamic Vantage-Point tree (or DVP-tree for short). The key values are the positions of the nodes (in some space) and the partitioning is done on the basis of a distance-metric between any two nodes. The construction algorithm basically picks a "vantage-point" among the data points to be partitioned, using some method (either picked at random, or with some outlier detection algorithm, or whatever). Then, it puts all the points with larger distance values from the vantage-point in one child, and all the points with smaller distance values in the other child, and proceeds like that recursively. If the tree is not binary (i.e., d-ary), then there are simply more children representing layers closer to the vantage-point and layers further from the vantage-point. And, then, there are a bunch of additional algorithms to take care of deletions and insertions, in the most economical way that keeps the tree balanced. These are typical space-partitioning things, the only special thing with the DVP-tree is that it does require the underlying space to be structured in a particular way, it only needs a valid distance-metric. You can see the implementation here.

The purpose of this space partitioning tree is to do efficient (log-time) nearest-neighbor queries. That is, ask it to find the list of nodes that are the nearest to a given position, or within a specified radius around the given position. This is just something that I need in my graph-algorithms which sample random points in the space, ask for nearest neighbors, and then attempts to make new connections to for a graph that represents the connectivity of the space. This is a motion-planning algorithm over a general space.

Now, the only reason why I originally talked about a BST and a B-tree, when in reality it is a DVP-tree and a d-ary BF-tree, is simply because, conceptually, they are of the same nature and are unrelated to the problem at hand. My problem is not about the specifics of these implementations (they are done). The problem I have applies just as well to an actual BST with an actual B-tree, I even could have used that as a toy-example. For the implementation of the adjacency-list overlaid on a breadth-first layout, it really doesn't matter what logic is used to construct the B-tree, nor do the details of the breadth-first layout matter either. In fact, I don't plan to modify those implementations, because I don't need to. The terms BST and B-tree are just simpler and more familiar to people, and it gives them the right picture of the problem at hand.

The point is that small lists should be flattened, large ones will need to use another contiguous chunk of the heap.

I beg to differ. The point is not so much about size, but about the uniformity of the sizes. If you have a tree like a binary search tree, then the number of out-going edges is not only predictable but fixed (except for leafs). If you have a general graph you could have wild variations in the number of in-going or out-going edges. Now, if you want to flatten them, you need limited the edges to a fixed maximum number. If the max is large with respect to the average adj-list size, then you will be able to accomodate the more extreme scenarios (many edges on one node), but you will also waste a lot of memory. If the max is small, it might have a bad effect on your algorithm (limits the connection-richness of the graph). The ideal situation is to have uniform and predictable edge counts (like in a d-ary tree), in which case, edge-lists can be inlined with minimum waste. I cannot guarantee uniform or predictable edge counts for the adjacency-list graph, so I can't inline them.

As for the actual look of the node structures, you didn't need to spell that out. I wasn't born yesterday.

The main question I had about that is when you said "stored flat in the partner node and relatively local". How do you achieve locality of the partner nodes with respect to each other without having to tie their placement directly to that of the B-tree nodes? Say you have three nodes A, B, and C, with partner nodes Ap, Bp, and Cp. The B-tree nodes (A,B,C) happen to be laid out next to each other in memory (presumably because the space-partitioning deemed them to have space-positions close to each other). Naturally, if you are looking at nodes (A,B,C) you might eventually want to look at nodes (Ap,Bp,Cp) to get graph-edge information, but, the problem is, these three nodes are likely to be in wide-spread positions in memory. If you want them to be close to each other, you'd have to lay them out the same way as the B-tree, meaning you might as well store them in the B-tree directly. But you said to store partner nodes "relatively local". Is there a way to loosely get better locality? Like guaranteeing that (Ap,Bp,Cp) are not worlds apart in memory.

Don't get me wrong, I'm totally happy even if there is no way to keep partner nodes relatively local, I still like that solution a lot. But if there are algorithms that achieve this, I'd be interested to know.

Are the b-tree nodes not flattened as well???

Yes, of course they are. In my implementation, the nodes are defined as:

``````struct value_type {
int out_degree;
vertex_property_type v;
edge_property_type e[Arity];

value_type() : out_degree(-1) { };
};
``````

Where `vertex_property_type` are the user-defined properties associated to the node, and `edge_property_type` are the user-defined properties associated to the edges. Both of these can be empty types, of course.

No. When I say node-data, I literally mean the "user-defined properties" type that you see in the above snippet. It does not include any graph or tree structural data (like adjacency-lists). What these properties hold depend on the application, in this case, for the motion-planners, that could be quite a lot (a state-vector, covariance matrices, cost-function values, heuristic values, etc.), it can total to quite a bit of data. So, I'm not too concerned with saving every possible bit by using special 32k segments and 16bit offsets, I'll just stick to native word sizes and properly aligned data, even if that means some waste in padding which is neglible compared to node-data and edge-data.

I think there was a bit of confusion because your use of the term B-tree is not the standard use of the term, combined with your mentions of BST. In particular, one does not take a BST structure and then wrap it inside a B-tree. A B-tree is a structure that replaces a BST, for example every leaf node is of the same depth. The B-tree structure I am thinking of is described at http://en.wikipedia.org/wiki/B-tree .

Because the BST structure exists to provide key ordering (and balancedness) and the B-tree structure exists to provide the same thing, they're redundant. However, what you were really describing was not a balanced BST, it's a balanced BT. Putting a balanced BT inside a B-tree (with nodes ordered how?) can make sense. (Incidentally, putting nodes whose locations are ordered by some comparison of points in R^n space can be done somewhat sensibly using a continuous space-filling curve based comparison function.) However, I don't think this is what you're doing.

You can't "replace a binary search tree with a B-tree", you can choose to store the binary search tree in a contiguous array with a breadth-first layout (i.e., a B-tree for short).

So now it is more clear what you are doing, I think. You are storing the root of the tree in an array of index 0, its children in indices 1 .. d, its grandchildren in indices d+1, ..., d + d^2, its great grandchildren in indices d+d^2+1, ..., d+d^2+d^3? That's what breadth-first layout means according to one paper I found.

If you were to, say, store a balanced binary tree in such a structure (for example, a red-black tree), you might run into trouble, since the depth of leaf nodes can vary by a factor of 2, and a tree with more than sqrt(2^47 / sizeof(node)) entries could ostensibly overflow the address space of a 64-bit system (because 2^47-1 is often the highest addressable positive memory address on 64-bit systems). However, that's different from the expected height, which is (I'm guessing) more like 1.5 or 1.7 times `log_2(N)`, not 2 times `log_2(N)`.

As for the actual look of the node structures, you didn't need to spell that out. I wasn't born yesterday.

I apparently needed to spell it out more, because you didn't understand what I tried to say. This confusion makes sense if you are packing nodes in an array, since they'd have to be equal width.

When you pack nodes in a B-tree as per the normal definition, they don't have to be equal width. You can have values be a different size. For example, the nodes of a b-tree might be 4-kilobyte blocks, which contain a list of serialized values, all of different width. When you get to a node, iterate through to find what key you're looking for. (Of course it's really more complicated than that, because you'd rather do a binary search through 4 KB of data, so at the beginning of the node you have an array to the offsets of the values, and the array is sorted by the value it references, so that you can do a binary search, and adding/removing values from the structure takes amortized constant time, and when you write it to disk you just do the system call write(...) on the block, and it's memory-aligned for directed I/O.)

Since you're apparently fitting within memory, large page sizes aren't that useful, so there's not much room for creativity in that regard.

Adapting to your breadth-first array scheme, you can have one partner node for a group of nearby d-ary tree nodes (for example, a parent and its d children) and then each adjacency list can be variable-sized and stored flat in the partner node. This is useful if the average adjacency list is quite small (and the partner node is a handful of cache lines). Just treat the partner node as an arena allocator, and when it gets full, compact the live entries into a new partner node. So the partner node probably ends up containing singly linked lists of variable-sized chunks. If you get the unusually large adjacency list, encode it to be a pointer to an array on the heap. But this is only worth doing if you think they're relatively small.

If they're generally big, you can make your adjacency lists be vectors or chunked lists and that's pretty much the best you can do, assuming all you do is iterate over adjacency lists, and insert or remove entries once you've found them.

Both of these can be empty types, of course.

I assume you know they'll end up taking up space.

I'm really sorry about all the confusion about the terms. I just realized, after my previous post, that the B-tree term also refers to this specific type of search tree. In the literature I have been reading, they used B-tree as a short for "Breadth-first layout (or tree)". As for BST, I thought it was a general term (umbrella term) for all binary search tree algorithms (in the same way as binary space partitioning (BSP) is an umbrella term for all sorts of different space partitioning techniques). I guess by trying to avoid talking about irrelevant details, I caused all sorts of confusion about what I was talking about.

To make it clear, the basic idea of a breadth-first layout is this. Say you have a binary tree, with node A as root, (B1,B2) as children, and (C1,C2) as children of B1 and (D1,D2) as children of B2. Then, you layout the nodes in a contiguous array as so: (A, B1, B2, C1, C2, D1, D2). This has the effect that if you do a breadth-first traversal of the tree, it is the same as doing a linear traversal of the array (hence, the name "breadth-first layout"). When you do a depth-first traversal or a searching for a particular key, the memory access pattern is also pretty efficient because it always goes forward in the array (as opposed to randomly jumping back and forth, which is what happens if you have a linked-list-style storage). And such access patterns are very favorable for good pre-fetching performance.

And, you are right, when storing in such a layout, the tree-balance becomes even more important because the underlying array has to grow to a size that accomodates the deepest branch of the tree. That's why my space partitioning method is fairly aggressive at re-balancing (and my tests show good balance).

If they're generally big, you can make your adjacency lists be vectors or chunked lists and that's pretty much the best you can do, assuming all you do is iterate over adjacency lists, and insert or remove entries once you've found them.

I do understand what you mean now about how to inline those variable-size adj-lists. But I'm afraid the complexity of dealing with it won't buy me much results at the end (I've done similar things before where it didn't pay off). I can hardly predict the sizes (or distribution of sizes) of these adj-lists, and I also don't expect to do any fancy traversal operations, only iterating through in- or out-edges of a given vertex.

I might consider doing a more customized adj-list implementation later, but for now, I'm gonna start by using a generic adjacency-list implementation (from BGL) to store the partner nodes, and do all the synchronization between the tree and the adj-list externally to both implementations. This is nice because I don't have to modify or copy-and-modify any of my previous code (the breadth-first layout tree, and the space partitioning logic) and I can re-use the adj-list implementation from BGL as is (supplemented with a graveyard heap). If I decide to make a specialized adj-list implementation for this purpose (e.g., to inline adj-lists, and try to maintain some locality between partner nodes), I can easily implement that as a replacement for the generic adjacency-list implementation.

I'm almost done implementing this, so I'll let you know what's up with it. Thanks again!

And such access patterns are very favorable for good pre-fetching performance.

That's true only at the base of the tree. If you're jumping from index i to 2 * i + 1, for any large i, you won't get prefetching performance from that reason.

Grouping every 1+d or 1+d+d^2 trees into contiguous blocks should ostensibly improve things, even though you say a COB-tree is not worth it. It's an isolated change to how parent/child indices are computed, so it's do-able, too.

That's true only at the base of the tree. If you're jumping from index i to 2 * i + 1, for any large i, you won't get prefetching performance from that reason.

You're right. At least, under simplified models. Pre-fetching, coupled with branch-prediction, has gotten a lot better and more complex over the last decade, making their performance very hard to predict without relying heavily on empirical evidence.

Also, here's another pre-fetching scenario that is not completely unrealistic IMHO. Say you are looking at two (or d) sibling nodes, branch-prediction is likely to figure out that you will be jumping to one of the children groups next (to look at the children of one sibling or the other), and because they are next to each other, the entire group of four (or d^2) nodes can be pre-fetched before knowing which pair you will want to look at next. And that mechanism works anywhere in the tree (because direct "cousins" will always be packed together). So, one block can be pre-fetched, and it will accomodate any branch that you will want to follow after the search conditions (e.g., node comparison) has been evaluated, without ever doing a pre-fetch mistake (unless you quit the traversal early), or having to wait for the condition. I'm not sure how good and sophisticated these pre-fetching with branch-predictions occur. But one thing is for sure, anything beats the linked-list-style storage, because you'll never be able to get such a consistent one-pre-fetch-for-all-branches behavior, regardless of how sophisticated the system is.

Also, if you're interested, here (click on Raw to see the eps file) is a plot of the query-time of my space partitioning versus the number of nodes in the tree. On the plot, there is the linear-search (the smooth straight line, duh!), the space-partitioning stored in a linked-list-style storage (very close to linear-search performance, except at low node-counts), and the exact same space-partitioning but stored in a breadth-first layout (the curve that is logarithmic). The linear search is good at low node-counts, obviously, because it has very little overhead (constant-term) and has the best memory access pattern of all method (linear, in-order memory traversal), but, of course, at higher node-counts the space-partitioning gives orders of magnitude savings with its log-time progression.

I believe the linked-list-style storage results in linear-time because the number of falsely-predicted pre-fetches grows a lot with the spread and size of the number of nodes.

Grouping every 1+d or 1+d+d^2 trees into contiguous blocks should ostensibly improve things, even though you say a COB-tree is not worth it. It's an isolated change to how parent/child indices are computed, so it's do-able, too.

Yeah. That's basically the intent and argument behind the COB-tree (and cache-aware versions too). My implementation is not as oblivious as the method in the papers describing the COB-tree (or even not oblivious at all, now that I think about it), because I simplified the indexing by using fixed blocks, as opposed to a logarithmic recursion of the block sizes, which requires a loop to resolve the indices for the vertices. In my implementation, the indices are closed-form expressions (similar to the `i -> 2 * i + 1` rule for a simple breadth-first layout). In any case, the idea is simply that you have a breadth-first layout to represent a limited-depth sub-tree, whose leaf nodes point to the root of other limited-depth sub-trees, e.g., blocks. So, the idea is to extend the good pre-fetching behavior that you'd get at the top of tree to within every block (or sub-tree). My implementation seems to be about 3 times slower than the straight breadth-first layout (but still log-time, and I think it catches up with the kind of node-counts that max-out my RAM), this is due to the more complicated indexing, but I have yet to test this with larger data and more complex comparisons, in which the indexing overhead might be subsumed. So, that's still an option (and of course, both implementations are completely interchangeable, so I intend to test both).

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, learning, and sharing knowledge.