Now, I know the first thing I'm going to be told is that I should add a balance function to my binary tree. Well, I have asked the group member who created our tree to do it, but he hasn't got back to me yet. So rather than wait, and have him possibly come back and say that he can't, I've decided to try to solve the problem myself without touching his tree.

Ok, I am making a binary tree which holds 62500 pointers into a height map, so that during our game loop data retrieval is faster than if we just used an array of meshes. The only problem is that I create the meshes in order, so inserting them into the tree straight away would form a linked list, which is as slow as an array, so there would be no point. What I was wondering is: is there some sort of algorithm I can run which will choose a mesh and add it to the tree (on creation each mesh is stored in an array, which is slower to search than a balanced tree, but if we placed them into the tree in creation order it would be just as slow as the array, so there is no point) so that the tree comes out automatically balanced? I was having a fiddle trying to make my own algorithm, but a month straight of uni assignments has fizzled my brain a fair amount...

By auto-balance I mean: it will start from the middle, then go to the middle of each remaining half, then the middle of each of those, and so on and so forth, filling all the branches so the depth of the tree is minimal.
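Something like this is what I'm picturing, a rough sketch only, assuming the meshes are sitting in a plain array in creation order and tree.add() is our existing insert (the names Tree and meshes are just placeholders):

// Rough sketch: insert the middle of each range first, then recurse into the
// two halves, so every level of the tree fills before the next one starts.
// 'Tree' and 'meshes' are placeholders for our actual BST type and array.
void addBalanced(Tree& tree, mesh_t* meshes[], int lo, int hi)
{
    if (lo > hi)
        return;

    int mid = lo + (hi - lo) / 2;
    tree.add(meshes[mid]);                  // middle of this range goes in first
    addBalanced(tree, meshes, lo, mid - 1); // then the middle of the left half
    addBalanced(tree, meshes, mid + 1, hi); // then the middle of the right half
}

// addBalanced(tree, meshes, 0, 62499);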

Thanks for the help you guys

All 15 Replies

If you can suffer the extra storage cost while building the tree, place all of the pointers into an array initially, shuffle the array randomly, then do the insertion:

{
    // Build all of the meshes into a scratch vector first...
    std::vector<mesh_t> v(62500);

    std::generate_n(v.begin(), v.size(), next_mesh);

    // ...then randomize the order before inserting into the tree.
    std::random_shuffle(v.begin(), v.end());

    for (auto mesh : v)
        tree.add(mesh);
}

Given a basic binary search tree, your best bet is simply to ensure that the items are inserted in a reasonably random order. This produces the average case of a well (if not ideally) balanced tree.

Well, I was considering that, but that means that on some runs, on certain days, it might lock up while it traverses 62500 nodes because the shuffle has somehow produced a linked list. I know it (almost certainly) won't happen, but there is still the possibility of spikes, and in games dev, if there is a way to have a more optimal system, you have to take it or you will end up sacrificing something else. I might just have to learn my group member's code and make a sort algorithm for it. I am currently skimming my data abstractions and algorithms book for it, but I would really prefer to write a recursive function which keeps halving the remaining range from the start of the insert scope to the end. I have the algorithm half nailed down, it's just been a long day and I'm not thinking.

Just to emphasise the commitment I have to 100% optimisation, I am considering going through manually, figuring out the perfect balance, putting it into a .csv and loading it in every time :P lol

If anyone has an idea of an easy way to make a .csv containing a balanced binary tree insertion order for the values 0-62499, I'm all ears.
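In case it makes what I'm after clearer, here's roughly what that generator would look like: it just spits out the 62500 indices in middle-first order, one per line (the file name is made up):

#include <fstream>
#include <vector>

// Emit indices in "median first" order, so that inserting them into a plain
// BST in this order produces a minimal-depth tree.
static void emitOrder(int lo, int hi, std::vector<int>& out)
{
    if (lo > hi)
        return;

    int mid = lo + (hi - lo) / 2;
    out.push_back(mid);
    emitOrder(lo, mid - 1, out);
    emitOrder(mid + 1, hi, out);
}

int main()
{
    std::vector<int> order;
    emitOrder(0, 62499, order);

    std::ofstream csv("balanced_order.csv"); // hypothetical file name
    for (std::size_t i = 0; i < order.size(); ++i)
        csv << order[i] << '\n';             // one index per line
}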

Put simply, if your data can arrive in sorted order, a basic binary search tree is the wrong data structure. You can certainly make it work, but there's a cost. Since high performance is clearly one of your goals, my recommendation is do it right from the beginning rather than try to "fix" a broken design, even if you have to write your own balanced tree (though your group member shouldn't have trouble doing it).

Ok then, I've done some research, and the DSW algorithm looks like it would be simple enough... Only problem, the code I've found has been written in C... Yes, I know C works in C++... I just don't like C :P I've thought about how I could convert it into C++... but it would be a long and painful task, as I would probably misspell something or miss an *, and the whole thing would fall apart...

Can anyone recommend a BST which supports DSW that isn't under copyright, that I would be able to use? (It's for an assignment, so I would just cite the author; that's why I need to know it isn't under ownership. I'd rather get 0 marks for a BST that's fast than 0 marks for one that's mine and doesn't work.) Currently, with the one we are using, you simply specify the data type of the BST through a template, and then I have written ==, = and > operator overloads. So, does anyone know of one?

If you can suffer the extra storage cost while building the tree, place all of the pointers into an array initially, shuffle the array randomly, then do the insertion:

{
    // Build all of the meshes into a scratch vector first...
    std::vector<mesh_t> v(62500);

    std::generate_n(v.begin(), v.size(), next_mesh);

    // ...then randomize the order before inserting into the tree.
    std::random_shuffle(v.begin(), v.end());

    for (auto mesh : v)
        tree.add(mesh);
}

Given a basic binary search tree, your best bet is simply to ensure that the items are inserted in a reasonably random order. This produces the average case of a well (if not ideally) balanced tree.

I really don't know if that's a better solution than just inserting it into a tree, unless I missed something.
If the OP really wants to do operations before inserting, then sort the meshes by some comparison and start adding the medians into the tree one by one.

Another thought: how about using std::map? It's usually implemented with a red-black tree, which has good balancing.
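Something along these lines, with a made-up key since I don't know what the tree is actually keyed on:

#include <map>

struct mesh_t;  // stand-in for whatever the actual mesh type is

int main()
{
    // Hypothetical: key each mesh by its creation index (or grid position).
    std::map<int, mesh_t*> meshes;

    for (int i = 0; i < 62500; ++i)
        meshes[i] = 0;  // would be the real mesh pointer; insertion stays
                        // O(log n) even though the keys arrive in sorted
                        // order, because the red-black tree rebalances itself

    // Lookups are O(log n) regardless of insertion order.
    bool found = meshes.find(31250) != meshes.end();
    (void)found;
}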

I really don't know if that's a better solution than just inserting it into a tree, unless I missed something.

What happens when you insert a sorted sequence of items into a binary search tree?

Maybe I missed something here, but if you have a sorted sequence of items, isn't it super easy to determine the BST order of insertion? You insert the middle item (median) first, then the median of each half and the median of each quarter and so on, so forth. But, of course, if you have a sorted sequence already, wouldn't you just keep that sequence and use a binary-search algorithm on it, instead of constructing a BST? Of course, if you don't have a random-access sequence, you might want to either transfer that sorted sequence into a random-access container, or build a BST (where each indexing operation during the insertions into the BST would be in linear-time, but you only do that BST construction once).
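As a quick sketch of what I mean by just binary-searching the sorted sequence (with stand-in integer keys, since I don't know what the real key type is):

#include <algorithm>
#include <vector>

int main()
{
    // If the keys are already sorted, a plain binary search over the array
    // gives the same O(log n) lookup a balanced BST would, with no tree at all.
    std::vector<int> keys;
    for (int i = 0; i < 62500; ++i)
        keys.push_back(i);                 // already in sorted order

    std::vector<int>::iterator it =
        std::lower_bound(keys.begin(), keys.end(), 31250);

    bool found = (it != keys.end() && *it == 31250);
    (void)found;
}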

Also, you have mentioned that the application is for looking up meshes distributed over a heightmap, which suggests a geometric structure; in that case a simple BST is probably not the best option. Typically people use binary space-partitioning techniques for that. Easy methods for cartesian spaces include BSP trees, quadtrees, octrees, or Kd-trees, and for more general metric spaces you can use a dynamic vantage-point tree (DVP-tree) (I implemented a DVP-tree in a day or so, it's not hard).

What happens when you insert a sorted sequence of items into a binary search tree?

That's the thing though, who says his items are being inserted in sorted order? Wouldn't insertion for him just be 'as random'?

That's the thing though, who says his items are being inserted in sorted order?

He did, in the first post.

He did, in the first post.

Oh I see, you're right. Anyway, as suggested, inserting the medians would produce the most optimal tree for that case. I don't see why the OP cannot do that.

I really don't know if that's a better solution than just inserting it into a tree, unless I missed something.
If the OP really wants to do operations before inserting, then sort the meshes by some comparison and start adding the medians into the tree one by one.

Another thought: how about using std::map? It's usually implemented with a red-black tree, which has good balancing.

If this is true about std::map using a red-black tree, I will probably scrap my tree and use std::map.

I ended up getting bored and making my own DSW tree (well, I took a basic DSW tree, which was written in C, and modified it to be neater and to work with objects).
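For anyone curious, the rebalance part boils down to roughly this; it's a sketch against a hypothetical Node struct, not the actual tree template we're using:

// Rough C++ translation of the DSW (Day-Stout-Warren) rebalance, written
// against a made-up node type; the real tree uses different names.
struct Node
{
    int   key;
    Node* left;
    Node* right;
};

// Phase 1: rotate the whole tree into a right-leaning "vine" (a linked list)
// hanging off a dummy pseudo-root.
void treeToVine(Node* pseudoRoot)
{
    Node* tail = pseudoRoot;
    Node* rest = tail->right;

    while (rest != nullptr)
    {
        if (rest->left == nullptr)
        {
            tail = rest;
            rest = rest->right;
        }
        else
        {
            Node* temp = rest->left;   // rotate right around 'rest'
            rest->left = temp->right;
            temp->right = rest;
            rest = temp;
            tail->right = temp;
        }
    }
}

// One compression pass: perform 'count' left rotations along the vine.
void compress(Node* pseudoRoot, int count)
{
    Node* scanner = pseudoRoot;

    for (int i = 0; i < count; ++i)
    {
        Node* child = scanner->right;
        scanner->right = child->right;
        scanner = scanner->right;
        child->right = scanner->left;
        scanner->left = child;
    }
}

// Phase 2: repeatedly compress the vine back into a balanced tree.
void vineToTree(Node* pseudoRoot, int size)
{
    int full = 1;               // largest (2^k - 1) not exceeding 'size'
    while (full <= size)
        full = full * 2 + 1;
    full /= 2;

    compress(pseudoRoot, size - full);   // handle the partially-filled bottom level

    for (int remaining = full; remaining > 1; remaining /= 2)
        compress(pseudoRoot, remaining / 2);
}

You call it through a dummy pseudo-root whose right child is the real root (treeToVine, then vineToTree with the node count), and read the new root back out of pseudoRoot->right afterwards.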

I really don't know if that's a better solution than just inserting it into a tree, unless I missed something.
If the OP really wants to do operations before inserting, then sort the meshes by some comparison and start adding the medians into the tree one by one.

Another thought: how about using std::map? It's usually implemented with a red-black tree, which has good balancing.

So, I have ordered data... Which will it be:
a. map will be faster
b. map and a balanced tree will be pretty much as fast as each other
c. a balanced tree will be faster

Also, can anyone link me to any documentation of what VS2010 uses for std::map? I want to ensure that it is right for the data I have. I asked my lecturer if using std::map was a good idea and he was like *grumpy face* "you should never use code whose implementation you can't see, because it might not be what you need"... Still, I'm willing not to care what he says if I can find out for myself that std::map will be as fast as a balanced BST...

So, I have ordered data... Which will it be:
a. map will be faster
b. map and a balanced tree will be pretty much as fast as each other
c. a balanced tree will be faster

You'll have a difficult time beating the performance of std::map (or std::set if the value and key are the same object). Implementations typically use a red black tree as the underlying data structure of std::set and std::map (a balanced tree is practically required), and they're written by very good programmers.
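For example, a sketch with a made-up mesh type and ordering, just to show the shape of a std::set with a comparison functor:

#include <set>

struct mesh_t { int id; /* ... */ };   // stand-in for the real mesh type

// Hypothetical ordering: compare meshes by id (or whatever the BST keys on).
struct MeshLess
{
    bool operator()(const mesh_t* a, const mesh_t* b) const
    {
        return a->id < b->id;
    }
};

int main()
{
    std::set<const mesh_t*, MeshLess> meshes;

    mesh_t m = { 42 };
    meshes.insert(&m);                         // O(log n); the tree stays balanced

    mesh_t probe = { 42 };
    bool present = meshes.count(&probe) != 0;  // O(log n) lookup by key
    (void)present;
}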

he was like *grumpy face* "you should never use code whose implementation you can't see, because it might not be what you need"...

He sounds like a complete idiot.

>>you should never use code whose implementation you can't see, because it might not be what you need.

That's idiotic, as Narue said. The interface, the behaviour and the performance of each operation on std::map or std::set are very strictly specified, and their performance requirements (logarithmic-time insertion and look-up) imply that a balanced BST must be used in the implementation. When you have a good specification of the interface, behaviour and performance, there is no need to be aware of the implementation details to be able to use it and have a clear idea of the resulting behaviour.
