0

Hi,
can anyone tell me why i can't clean up the memory,
and maybe a small solution.

I'm allocing memory with strdup for each key,
but I cant figure out how to erase the key with free.

Before anyone starts talking about mixing c++ and c,
I know it's generally a bad idea but std::strings are to slow.

#include<iostream>

#include<cstring> //for cstring functions
#include<map> //associative array
#include<cstdlib> 


struct cmp_str {
  bool operator()(const char *a,const  char *b) {
    return std::strcmp(a, b) < 0;
  }
};

void printMap(std::map<const char*,int,cmp_str> &asso){
  for(std::map<const char*,int>::const_iterator it = asso.begin(); it != asso.end(); ++it){
    printf("%s:\t %d\n",it->first,it->second);
  }
}



std::map<const char*,int,cmp_str> build(){
  std::map<const char*, int,cmp_str> asso;
  
  const char *text = "does is work";
  char *tmp=NULL;
  tmp = strtok(strdup(text)," \t");
  
  while(tmp!=NULL){
    asso[strdup(tmp)] = 1;
    tmp=strtok(NULL," \t");
  }
  return asso;
}

void cleanup(std::map<const char*,int,cmp_str> &asso){
  std::map<const char*,int,cmp_str>::iterator it;
  asso.erase(asso.begin(),asso.end());
}


int main(int argc, char** argv){
  std::map <const char*, int,cmp_str> asso = build();
  printMap(asso);
  printf("should return 1: %d\n",asso["is"]);
  printf("should return 0: %d\n",asso["0"]);
  cleanup(asso);
  return 0;
}
4
Contributors
13
Replies
14
Views
7 Years
Discussion Span
Last Post by ArkM
0

Erasing the map elements will remove the containers but it won't 'free' the memory you allocated for the objects contained within them.

0

Yes I know,
this is what I write in my original post.

How do I free the memor, that I've allocated.

0

For anyone that cares,
this seems to do the trick.
But it is important that the item decleration is used directly at the assignmen operator

[TEX=c++]
my_map::value_type item = *it;
data.erase(it);
free (item.first);
[/TEX]

0

It's so easy:

typedef std::map<const char*,int,cmp_str> Map;
void cleanup(Map& m)
{
  for (Map::iterator i = m.begin(); i != m.end(); i++)
        free(const_cast<char*>(i->first));
  m.clear(); 
}

Be careful: this map must have keys allocated by C-functions strdup or malloc.

0

Cheers arkM,

I have no idea what happened here...

But the const_cast is not really necessary, is it?

0

here... where?

const_cast is necessary because iterator->first yields a pointer that is not suited for the function free ( map<const char*... ).

If you don't understand my code:

We must deallocate map key memory chunks obtained from strdup(s). The for loop iterates all mapped elements and deallocates these memory chunks. Now we can clear the map - that's all.

What's a problem?

0

Thanks for your response.

Is the const_cast necessary because i definded my map as a

typedef std::map <const char*,int> aMap;

instead of

typedef std::map <char*,int> aMap;

Having a key as const in a map is to prefere right?

thanks again

0

>Is the const_cast necessary because i definded my map as a ...
Yes.
>Having a key as const in a map is to prefere right?
It does not matter in your case.
I don't undestand why you are so disturbing on const_cast ;)

0

>Before anyone starts talking about mixing c++ and c,
>I know it's generally a bad idea but std::strings are to slow.
It's not generally a bad idea, but I'm curious why you think std::strings are too slow. Have you written your code using them, profiled it, and determined that the biggest bottleneck is inside the string class? Or are you making a conclusion based on subjective observation and/or hearsay?

0

I'm parsing 900 gig of textfiles
Using std::string as opposed to char* directly from getline,
takes around 40-60% longer.

0

If you see to the fastest code:

1. Avoid using expensive malloc/strdup for every new key string value. Place new C-strings to preallocated buffers chain (or even into std::string with a proper capacity) including terminating zero byte then use a pointer to the string as map key (no troubles with key deallocations)

2. Allocate large buffer fo process file (use std::filebuf and putsetbuf member to assign large preallocated buffer) or use C i/o library functions directly. As usually, default file stream buffers are too small for bulk file processing.

3. std::map (intrusive) container is a good thing but optimized AVL-tree implementations as usually are much more faster (that's proved ;))...

std::string is not a bottleneck, see points above (acceleration factor ~3-5 or more)...

0

Cheers arkM,

I will for sure look into this.

The avl is not part of the stl right?
So I have to make this one up.


As it is, the lookup map, takes aprox 3-4 gig memory,
I have 8 gig, but this is really the limit.
35million keys, each with a int[5] as value.

So i really can't just malloc huge amounts.
Or maybe I dont understand you...

The files I read are flatfiles, same amount of columns on each line,
so you are saying I should just buffer huge huge amounts.
And then manually tokenize by newlines, and then item seperate it?

Thanks for your advice,

0

AVL-trees: there are some good freeware implementations in C and C++ (search Goodle). Think: std::map with char* key is not too slow.
Memory: 35 millions keys * ~12-30 bytes (AVLT node size) = ~0.4-1 Gb - that's all.
String buffer: you allocates much more memory for std::map; 35 millions of malloc/new - too expensive if you see the fastest code.
Buffering: it's not so expensive (~64 Kb is enough)
Tokenizing: try and compare line parsing with operator >> and a simple scanner code (I'm sure that the last one is much more faster).

Of course, the right way to go:
- make prototype with STL (may be it has good characteristics)
- profile application
- step by step replace parts mentioned above
- test application, compare characteristics

This topic has been dead for over six months. Start a new discussion instead.
Have something to contribute to this discussion? Please be thoughtful, detailed and courteous, and be sure to adhere to our posting rules.