We're a community of 1076K IT Pros here for help, advice, solutions, professional growth and fun. Join us!
1,075,996 Members — Technology Publication meets Social Media

# Heap Data Structure - Removing an arbitrary item

Hey,

I'm studying for a final exam and I came across an old question asking to give an algorithm that runs in [TEX]O(logn)[/TEX] and deletes an arbitrary element from an [TEX]n[/TEX] element Min-Heap if given its index.

I was thinking the right approach would be to use Hash table for direct access in [TEX]O(1)[/TEX] time and then delete the element in [TEX]O(logn)[/TEX] time and maintaining the heap property.

I don't really know how to go about writing an algorithm that does this though, so any help would be really appreciated.

Thanks

2
Contributors
10
Replies
4 Days
Discussion Span
1 Year Ago
Last Updated
12
Views
Question
blackrobe
Junior Poster in Training
52 posts since Oct 2008
Reputation Points: 10
Skill Endorsements: 0

I think you misunderstand the question. What are you using the hash table for?

Rashakil Fol
Super Senior Demiposter
Team Colleague
2,732 posts since Jun 2005
Reputation Points: 1,153
Skill Endorsements: 25

Well hash table would be used to speed up searching for the item in the heap. I thought that searching would take logn and deletion logn taking into account any changes done to maintain the heap property.

blackrobe
Junior Poster in Training
52 posts since Oct 2008
Reputation Points: 10
Skill Endorsements: 0

If a heap is stored in an array, you can access the element at a given index in constant time. I think this is what the question assumes. I don't know what it would mean by "its index" if the heap is stored using an old fashioned tree with nodes and child pointers.

Rashakil Fol
Super Senior Demiposter
Team Colleague
2,732 posts since Jun 2005
Reputation Points: 1,153
Skill Endorsements: 25

If a heap is stored in an array, you can access the element at a given index in constant time. I think this is what the question assumes. I don't know what it would mean by "its index" if the heap is stored using an old fashioned tree with nodes and child pointers.

Here's the exact question:

Describe an O(logn) time algorithm for deleting an arbitrary element for an `n` element heap, given the location of the element in the heap. The remaining elements must form a valid heap once you have executed your algorithm.

On a higher level, i.e. ignoring the exact locations of a node and its parent, I wrote this:

``````Delete(A, index):
[INDENT]while A[index] > A[parent(index)]:
[INDENT]parent = A[parent(index)]
A[parent(index)] = A[index]
A[index] = parent[/INDENT]
return Extract-Min();[/INDENT]``````

This would switch up the element to be removed with its parent until its placed at the very top, at which point the heap operation Extract-Min pops the element and restructures the heap.

Not sure if thats O(logn) though.

blackrobe
Junior Poster in Training
52 posts since Oct 2008
Reputation Points: 10
Skill Endorsements: 0

Why aren't you sure it's O(log n)? (It seems totally obvious to me, of course I'm very familiar with this sort of thing.)

Specific questions I have to probe your understanding:

- Do you know the running time of Extract-Min? What is it?

- Do you know what shape a heap's tree looks like?

- Do you know the depth (i.e. height) of a balanced binary tree?

Rashakil Fol
Super Senior Demiposter
Team Colleague
2,732 posts since Jun 2005
Reputation Points: 1,153
Skill Endorsements: 25

- Do you know the running time of Extract-Min? What is it?
I believe this operation would take around O(logn) time since after extracting the min element, the heap has to rebalance itself by looking for next min element.

- Do you know what shape a heap's tree looks like?
Heap tree is a binary tree fully balanced for internal nodes at least and fills up the leaves from left to right.

- Do you know the depth (i.e. height) of a balanced binary tree?
Logn height? As its splitting by a power of two at each level. I mean n/2 level 1, n/4 level 2, n/8 level 3, and so on...at the very end (leaves level) it wont split and will only contain leaves that is 1 node. So for a general level I, n/2^i = 1 which taking logs would give i (the max depth at this point) equal to logn.

blackrobe
Junior Poster in Training
52 posts since Oct 2008
Reputation Points: 10
Skill Endorsements: 0

So it seems like you understand most things, but you are thinking somewhat fuzzily and imprecisely. You say "around O(log n)", you count some leaf nodes as internal nodes in your answer to the second question, and you said a balanced binary tree is log n height with a question mark.

So I have to wonder these things: Why are you unsure? Why are you saying "around" O(log n)?

Why can't you compute the running time of that function?

is what I am wondering. If you know the height of a balanced binary tree, you should be able to compute the running time of the while loop. If you know the running time of Extract-Min, you should be able to compute the running time for the entire function. If you don't know what running time means, or don't know the math, you need to learn that.

Rashakil Fol
Super Senior Demiposter
Team Colleague
2,732 posts since Jun 2005
Reputation Points: 1,153
Skill Endorsements: 25

So it seems like you understand most things, but you are thinking somewhat fuzzily and imprecisely. You say "around O(log n)", you count some leaf nodes as internal nodes in your answer to the second question, and you said a balanced binary tree is log n height with a question mark.

So I have to wonder these things: Why are you unsure? Why are you saying "around" O(log n)?

Why can't you compute the running time of that function?

is what I am wondering. If you know the height of a balanced binary tree, you should be able to compute the running time of the while loop. If you know the running time of Extract-Min, you should be able to compute the running time for the entire function. If you don't know what running time means, or don't know the math, you need to learn that.

Oh yes I counted some as internal nodes at depth (i - 1) when they could have been leaves in fact, in addition to leaves at depth i.

Moreover, I somewhat find the way in which the depth (height) relates to the number of nodes at each level confusing to me. Perhaps because I think of how the Heap is represented both as an array and as a tree. But I know that with every deeper i, the Heap splits children such that its got 2^i nodes at each i.

The while loop runs in O(log n) time because the worst case depth an element could have is the whole height of the tree. Therefore, O(log n) swaps with parents, Increasing its priority, and raising it up in the tree till it becomes the root. At which point, the while loop terminates and the min extracted also in O(log n) time because after the min is popped, we look for a successor to be root, i.e the new min which in the worst case can be at the deepest level in the tree.

So O(log n) + O(log n) = O(log n) time in total.

blackrobe
Junior Poster in Training
52 posts since Oct 2008
Reputation Points: 10
Skill Endorsements: 0

> Moreover, I somewhat find the way in which the depth (height) relates to the number of nodes at each level confusing to me. Perhaps because I think of how the Heap is represented both as an array and as a tree. But I know that with every deeper i, the Heap splits children such that its got 2^i nodes at each i.

Okay. It seems to me that you get it. At the end of this post I will show an exact calculation of the heap's height.

> The while loop runs in O(log n) time because the worst case depth an element could have is the whole height of the tree.

Also there's implicit in your argument the fact that the body of the while loop takes O(1) time. It's perfectly reasonable that you omitted this obvious fact in writing down your explanation. This and the rest of your argument are correct.

Now here's how I would calculate the exact height of the complete binary tree. Let's first note that the first k levels of the tree can hold 1, 2, 4, 8, ..., 2^(k-1) nodes, respectively. Then the first k levels of the tree can hold a total of 1 + 2 + 4 + 8 + ... + 2^(k-1) nodes. That sum equals 2^k - 1. So, if we have a complete tree with n nodes, where 2^k <= n < 2^(k+1), the first k levels of the tree cannot hold all the nodes, and the first k + 1 nodes suffices. A complete tree will thus have k+1 levels when log_2 (2^k) <= log_2(n) < log_2(2^(k+1)), that is, when k <= log_2(n) < k + 1. Note that log_2 is the log base 2 function. If we take a value x such that k <= x < k + 1, where k is an integer, then k = floor(x). So k + 1 = floor(x) + 1, or in this case, floor(log_2(n)) + 1.

So the number of levels of a complete tree with n elements is floor(log_2(n)) + 1.

Usually, instead of computing the inequalities all carefully, I prefer to just take some examples and notice the pattern.

``````n    #levels
1    1
2    2
3    2
4    3
5    3
6    3
7    3
8    4
9    4
...
15   4
16   5
...``````

It's pretty clear it increases by 1 every time you hit a power of two, and you'll want to invert the 2^x function, and playing around with guesses like log_2(n) and floor(log_2(n)) and ceil(log_2(n)), you can fix up the right formula that way. Of course "noticing a pattern" is not a proof, just a good way to find the answer, or to check the answer you've found by other means.

Rashakil Fol
Super Senior Demiposter
Team Colleague
2,732 posts since Jun 2005
Reputation Points: 1,153
Skill Endorsements: 25

> Moreover, I somewhat find the way in which the depth (height) relates to the number of nodes at each level confusing to me. Perhaps because I think of how the Heap is represented both as an array and as a tree. But I know that with every deeper i, the Heap splits children such that its got 2^i nodes at each i.

Okay. It seems to me that you get it. At the end of this post I will show an exact calculation of the heap's height.

> The while loop runs in O(log n) time because the worst case depth an element could have is the whole height of the tree.

Also there's implicit in your argument the fact that the body of the while loop takes O(1) time. It's perfectly reasonable that you omitted this obvious fact in writing down your explanation. This and the rest of your argument are correct.

Now here's how I would calculate the exact height of the complete binary tree. Let's first note that the first k levels of the tree can hold 1, 2, 4, 8, ..., 2^(k-1) nodes, respectively. Then the first k levels of the tree can hold a total of 1 + 2 + 4 + 8 + ... + 2^(k-1) nodes. That sum equals 2^k - 1. So, if we have a complete tree with n nodes, where 2^k <= n < 2^(k+1), the first k levels of the tree cannot hold all the nodes, and the first k + 1 nodes suffices. A complete tree will thus have k+1 levels when log_2 (2^k) <= log_2(n) < log_2(2^(k+1)), that is, when k <= log_2(n) < k + 1. Note that log_2 is the log base 2 function. If we take a value x such that k <= x < k + 1, where k is an integer, then k = floor(x). So k + 1 = floor(x) + 1, or in this case, floor(log_2(n)) + 1.

So the number of levels of a complete tree with n elements is floor(log_2(n)) + 1.

Usually, instead of computing the inequalities all carefully, I prefer to just take some examples and notice the pattern.

``````n    #levels
1    1
2    2
3    2
4    3
5    3
6    3
7    3
8    4
9    4
...
15   4
16   5
...``````

It's pretty clear it increases by 1 every time you hit a power of two, and you'll want to invert the 2^x function, and playing around with guesses like log_2(n) and floor(log_2(n)) and ceil(log_2(n)), you can fix up the right formula that way. Of course "noticing a pattern" is not a proof, just a good way to find the answer, or to check the answer you've found by other means.

Thanks a lot for the clear calculation. Seems so much simpler now. I can prove correctness by induction on k.

I totally agree, that when it comes to finding a solution to a problem I often find myself trying out a few examples and looking for a pattern. Then formulate a solution.

This is also done in many textbooks on algorithms, however the majority merely show the final guesses with minimal or ambiguous explanation of how a solution was reached, not to mention the "jumps" in their calculations by referencing theorems and equations presented 10 or more previous chapters.

Thanks for your help, very much appreciated.

blackrobe
Junior Poster in Training
52 posts since Oct 2008
Reputation Points: 10