disposing of a large data structure....

Question

n.aggel 13 Posting Whiz in Training

17 Years Ago

Hi, i have created a data structure {heap based}, capable of manipulating up to 10 million elements....I noticed that i can't go above this number because i have serious memory leaks....

The problem seems to be that i haven't written a destructor!{i a bit used to garbage collection!}

if i write a destructor of this type:

void clean( node *myNode )
{
        if( myNode != NULL )
          {
                clean( myNode>left );
                clean( myNode->right );
                delete myNode;
          }
}

then with so many elements, i will make the data structure even heavier...

so is there any other way to write the destructor??

thank you in advance for your time,

nicolas

c++ data-structure

4 Contributors
14 Replies
223 Views
3 Years Discussion Span
Latest Post 14 Years Ago Latest Post by ani malviya

All 14 Replies

vijayan121 1,152 Posting Virtuoso

17 Years Ago

since a heap is a *balanced* binary tree (it has the 'shape' property), you do not need nodes to implement it. we can store the heap in a compact manner in a (dynamically resizeable) array; this would be much more efficient. see: http://en.wikipedia.org/wiki/Binary_heap#Heap_implementation

vijayan121 1,152 Posting Virtuoso

17 Years Ago

..so basically the problem is if i can free the memory without visiting every node....

since all the nodes are of the same size, you could use a pool allocation (also called "simple segregated storage"). see: http://www.boost.org/libs/pool/doc/concepts.html
boost.pool is a library you could use; it is very efficient. and you could free all chunks in a pool in one go. if the destructor of the node is a trivial one, you can bypass visiting all nodes. if not, you will have to traverse every node (with a little extra overhead: keep a pointer to the next allocated node with each node in the pool), this would be just a sequential, linear traversal.

vijayan121 1,152 Posting Virtuoso

17 Years Ago

> ...what do you mean by saying if the destructor of a node is a trivial one..here is my node:
the destructor of you node is trivial (ie. a destructor is deemed to be declared, but is not synthesized by the implementation).

> and if this is the only way to implement pool, i will have to write the part of the program that instantiates the heap....{everywhere i've use the new operator...}
the solution would be to overload operators new and delete for your node.
here is how you could do it using boost.pool

#include <boost/pool/pool.hpp>
#include <cassert>

struct node // has a trivial destructor
{
  int element ;
  node* left ;
  node* right ;
  int field ;
  static boost::pool<> node_pool ;
  static void* operator new( size_t sz )
  {
    assert( sz == sizeof(node) ) ;
    return node_pool.malloc() ;
  }
  static void operator delete( void* ptr )
  {
    assert( node_pool.is_from(ptr) ) ;
    node_pool.free(ptr) ;
  }
};

boost::pool<> node::node_pool( sizeof(node) ) ;

void allocate_lots_of_nodes_but_do_not_deallocate_them()
{
  // function allocates a large number of nodes
  for( int i = 0 ; i < 1023*1024*16 ; ++i )
  {
    node* pn = new node ;
    // do something with pn; perhaps insert into a tree etc.
    // don't bother to delete
  }
} 

int main()
{
  for( int i=0 ; i<8 ; ++i )
  {
    allocate_lots_of_nodes_but_do_not_deallocate_them() ;

    // frees every memory block. invalidates any pointers previously
    // returned by allocation functions
    node::node_pool.purge_memory() ;
  }
}
/**
>g++ -Wall -std=c++98 -O3 -funroll-loops -I/usr/local/include -DNDEBUG use_pool.cpp
use_pool.cpp: In function `void allocate_lots_of_nodes_but_do_not_deallocate_them()':
use_pool.cpp:30: warning: unused variable 'pn'
>time ./a.out
37.386u 36.108s 1:17.34 95.0%   5+4641k 0+0io 0pf+0w
>uname -sr
FreeBSD 6.2-RELEASE-p7
>dmesg | grep CPU:
CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2800.10-MHz 686-class CPU)
*/

takes 67.34 seconds for 128 million allocations and 8 bulk deallocations; almost all of it for allocations. this is on a 2.8GHz P4 on FreeBSD, linux would obviously be much slower.

vijayan121 1,152 Posting Virtuoso

17 Years Ago

> i did what you suggested and i got::....
you have not defined node::node_pool.
see line 23 of the earlier post boost::pool<> node::node_pool( sizeof(node) ) ;

n.aggel commented: excellent poster! +1

Reply to this topic

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.

Hamrick 150 Posting Whiz · Answer 1 · 2007-08-20T19:54:52+00:00

What do you mean by making the data structure heavier? You have to free the memory and to free the memory you have to visit every node. The only heavy thing about the clean function is that it uses recursion. That's a pre-order tree walk so you can make it non-recursive pretty easily with a stack.

void clean( node *myNode )
{
  if ( myNode == NULL ) {
    return;
  }

  stack<node*> st;

  st.push( myNode );

  while ( !st.empty() ) {
    node *temp = st.top();

    st.pop();

    if ( temp->right != NULL ) {
      st.push( temp->right );
    }

    if ( temp->left != NULL ) {
      st.push( temp->left );
    }

    delete temp;
  }
}

n.aggel 13 Posting Whiz in Training · Answer 2 · 2007-08-20T20:21:23+00:00

first of all thank you, for your quick answers

since a heap is a *balanced* binary tree (it has the 'shape' property), you do not need nodes to implement it. we can store the heap in a compact manner in a (dynamically resizeable) array; this would be much more efficient. see: http://en.wikipedia.org/wiki/Binary_heap#Heap_implementation

Actually, it isn't a heap but a leftist heap....so i must construct the tree using pointers....

What do you mean by making the data structure heavier? You have to free the memory and to free the memory you have to visit every node. The only heavy thing about the clean function is that it uses recursion. That's a pre-order tree walk so you can make it non-recursive pretty easily with a stack.

The problem is that using the recursive algorithm i wrote on my first post, i might get out of stack...so basically the problem is if i can free the memory without visiting every node, or constructing a stack in advance...

like:: visit nodeA, delete nodeA, visit nodeB, delete NodeB, ...... so i start from the root and go down to the leaves....

Hamrick 150 Posting Whiz · Answer 3 · 2007-08-20T20:34:48+00:00

so basically the problem is if i can free the memory without visiting every node

You can handle memory by storing it all in one big block and passing out chunks just big enough for each node. Then freeing it turns into just one call to delete. But that doesn't call destructors for all of the nodes, and unless the nodes only contain built in types and no pointers that's why you need to visit every node.

or constructing a stack in advance...

You only run out of stack with the recursion. I posted a function that uses the STL stack adapter non-recursively. It uses dynamic memory so it can grow as much as you need it to.

like:: visit nodeA, delete nodeA, visit nodeB, delete NodeB, ...... so i start from the root and go down to the leaves....

That's what my clean function does. :)

n.aggel 13 Posting Whiz in Training · Answer 4 · 2007-08-20T21:36:43+00:00

since all the nodes are of the same size, you could use a pool allocation (also called "simple segregated storage"). see: http://www.boost.org/libs/pool/doc/concepts.html
boost.pool is a library you could use;
it is very efficient. and you could free all chunks in a pool in one go. if the destructor of the node is a trivial one, you can bypass visiting all nodes. if not, you will have to traverse every node (with a little extra overhead: keep a pointer to the next allocated node with each node in the pool), this would be just a sequential, linear traversal.

This was very interesting...{i hadn't heard it before...}
Some questions::
---Will i have to implement my own allocator? {because i do allocate a lot very little objects....basically my objects are integeres and pointer to nodes with other integers....}
---the documenation was nice, but couldn't manage to find an example, i only found this http://boost.org/libs/pool/doc/interfaces.html
and if this is the only way to implement pool, i will have to write the part of the program that instantiates the heap....{everywhere i've use the new operator...}
---do you have a link to any example?
---finaly what do you mean by saying if the destructor of a node is a trivial one..here is my node:

class myNode
{

	int element;	
	myNode *left;
	myNode *right;
	int field;
};

i think that the destructor has to be simple...NOTE: i am making a destructor only for my tree class, should i also make for the node class?

You only run out of stack with the recursion. I posted a function that uses the STL stack adapter non-recursively. It uses dynamic memory so it can grow as much as you need it to.

i didn't know that, apparently i have to look again memory management...does anyone have a good link to share, or a good book{that i can buy}?

I've used hamrick method {because it didn't require any changes} but the problem remains..... i mean that with under million elements the operation is completed in under second....with 10 million it takes on average 7 seconds, but then the computer goes seriously slow{i guess it must be a memory leak}.....if i go above 10 millions i can do only 1 iteration...on second the copmuter freezes.... here is my testbench{i ve used elements vijayan121 suggested in previous post...}::

enum { HOW_MANY_ELEMENTS = 10000000, TEST_ITERATIONS = 2, DISTRIBUITION_AREA=1024*1024 };
	
	
	
	
		mt19937 engine ;
		uniform_int<> distribution( 1, DISTRIBUITION_AREA ) ; 

	              variate_generator< mt19937&, uniform_int<> > 
				generator( engine, distribution ) ;
     
     
	double make_heap_total = 0.0 ;
	double make_heap_min = numeric_limits<double>::max() ;
	double make_heap_max = 0.0 ;
	double sort_heap_total = 0.0 ; 
	double sort_heap_min = numeric_limits<double>::max() ;
	double sort_heap_max = 0.0 ;

for( int i=0 ; i<TEST_ITERATIONS ; ++i )
{
		LeftistHeap init;	//The test structure

		//For vector randomization	
		vector<int> vec ; 
		vec.reserve(HOW_MANY_ELEMENTS) ;
		generate_n( back_inserter(vec), size_t(HOW_MANY_ELEMENTS), generator ) ;


     		//printVector(v);
		
		cout<<"Initialize "<<i<<" tis Priotity Queue::"<<endl;
		clock_t start = clock() ;
		init.initializeHeap(vec);
    		clock_t end = clock() ;
    		//init.print(1);
    		cout<<"end of Initialization phase "<<i<<" tis Priotity Queue::"<<endl;
    		double duration = elapsed_msecs( start, end ) ;
    		make_heap_total += duration ;
    		make_heap_min = min( make_heap_min, duration ) ;
    		make_heap_max = max( make_heap_max, duration ) ;
	
}


cout << "make_heap - min: " << make_heap_min << "\t  max: " << make_heap_max
        << " average: " << make_heap_total / TEST_ITERATIONS << " millisecs\n" ;

PS: sorry for my long posts, thanks again for your answers!

Hamrick 150 Posting Whiz · Answer 5 · 2007-08-20T22:49:57+00:00

---Will i have to implement my own allocator?

I don't know much about boost.pool, but unless it does all of the work for you then you will have to implement your own allocator.

---finaly what do you mean by saying if the destructor of a node is a trivial one..

If you need to override the default destructor, it's not trivial. :) Your nodes don't even use an overridden constructor because they don't manage resources. So they're trivial and you don't need to call the destructor. You can just free the memory.

but then the computer goes seriously slow{i guess it must be a memory leak}

You shouldn't guess. It could be that something is wrong with your algorithms or you're just doing too much for the machine to handle. If you think it's a memory leak, you should use some kind of memory leak detector on your program. Then if it's not a leak you should profile everything and try to find the bottleneck. Changing things around based on a guess just isn't a good idea. ;)

n.aggel 13 Posting Whiz in Training · Answer 6 · 2007-08-20T23:40:26+00:00

so, if use this techinque then i will have to delete the destructor on my heap class?{this class is responsible of "combining" nodes to form a heap?

vijayan121 1,152 Posting Virtuoso · Answer 7 · 2007-08-20T23:52:48+00:00

if you are going to have only one object of your heap class at a time (ie. you do not create a second heap before destroying the existing one), you could just call boost::pool<>::purge_memory() in your destructor. or you could remove the destructor and call this from outside. or you could have one boost::pool<> per heap, and call boost::pool<>::purge_memory() on it in the destructor. the last option has technical merit, but would require some modification to the part of the code where you allocate/deallocate nodes.

n.aggel 13 Posting Whiz in Training · Answer 8 · 2007-08-20T23:58:29+00:00

i did what you suggested and i got::

g++ LeftistHeap.o main.o -o project
LeftistHeap.o: In function `myNode::operator delete(void*)':
LeftistHeap.cpp:(.text._ZN6myNodedlEPv[myNode::operator delete(void*)]+0x10): undefined reference to `myNode::node_pool'
LeftistHeap.cpp:(.text._ZN6myNodedlEPv[myNode::operator delete(void*)]+0x4e): undefined reference to `myNode::node_pool'
LeftistHeap.o: In function `myNode::operator new(unsigned int)':
LeftistHeap.cpp:(.text._ZN6myNodenwEj[myNode::operator new(unsigned int)]+0x9): undefined reference to `myNode::node_pool'
collect2: ld returned 1 exit status
make: *** [project] Error 1

i think it is a linker error{that it can't find pool}, although i have included

#include <boost/pool/pool.hpp>
#include <cassert>

n.aggel 13 Posting Whiz in Training · Answer 9 · 2007-08-21T01:18:58+00:00

thank you for your help.....

for the record with the destructor working, i managed to get more iterations..

ani malviya 0 Newbie Poster · Answer 10 · 2010-10-05T00:25:08+00:00

can u provide me a basic level notes on DSA till the graphs in data structures...its urgent n m a student of engg 1st year exams ahead need ur help...

disposing of a large data structure....

Recommended Answers Collapse Answers

All 14 Replies

Recommended Answers