Hi All,

For each element in data1, I need to find which elements in data2 are related to it. Likewise, for each element in data2, I need to find which elements in data1 are related to it. So I set up a mutually linked data structure, as you can see below:

class data1 {
  // class variables
  int id;
  .....

  map<string, float> m_to_data2; // key: keyword of a data2 element; value: reference info recorded when data1 and data2 are related
};

class data2 {
  // class variables
  .....
  list <int> src_id;  // int is the id of data1
};

map<string, data2 *> map_2;
map<int, data1 *> map_1;

Then I parse a file and fill map_1 and map_2. Here is what I measured:
(1) Total memory usage after setting up the two mutually linked maps: 498.7M.
(2) Without setting the link from data2 to data1 (list <int> src_id not filled): 392.7M.
(3) Without filling map_1 and without filling list <int> src_id in data2: 182.0M.
(4) Without filling map_1, but with list <int> src_id filled with the ids of data1: 289.7M.
(5) Without filling map<string, float> m_to_data2: 290.0M.

(6) The size of map_1: 77737
(7) The size of map_2: 1830009
(8) The size of map<string, float> m_to_data2 for each element of map_1: between 3 and 17522
(9) The size of list <int> src_id for each element of map_2: between 1 and 1377

I need to reduce the memory usage after setting up the mutually linked maps (ideally to less than 200M; it is currently 498.7M, as you can see above). I was trying to tokenize the string (the keyword of data2) to an int by setting up an extra map<string, int>, since an int needs less memory than a string, but it may not help much because the extra map<string, int> needs memory too. Any suggestions?
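A minimal sketch of that tokenizing idea, to make the trade-off concrete. KeywordInterner and its member names are hypothetical, not part of the code above. Each distinct keyword is kept in the interner (once as a map key, once in a vector for reverse lookup), so the per-link containers would only need to hold an int token instead of a full string. Whether that saves memory overall depends on how often each keyword is repeated across links; this has not been measured.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: assigns each distinct keyword a dense int token
// (0, 1, 2, ...) so that links can store the token instead of the string.
class KeywordInterner {
public:
    // Returns the existing token for `key`, or assigns the next free one.
    int intern(const std::string& key) {
        std::map<std::string, int>::const_iterator it = token_of_.find(key);
        if (it != token_of_.end()) return it->second;
        int token = static_cast<int>(keys_.size());
        keys_.push_back(key);
        token_of_.insert(std::make_pair(key, token));
        return token;
    }

    // Reverse lookup: token back to its keyword.
    const std::string& keyword(int token) const {
        assert(token >= 0 && static_cast<std::size_t>(token) < keys_.size());
        return keys_[static_cast<std::size_t>(token)];
    }

    // Number of distinct keywords seen so far.
    std::size_t size() const { return keys_.size(); }

private:
    std::map<std::string, int> token_of_;  // keyword -> token
    std::vector<std::string> keys_;        // token -> keyword
};
```

With tokens in hand, m_to_data2 could become a map<int, float>, and the string would only be needed when printing or parsing.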

Your comments/suggestions are highly appreciated.
Thank you very much.

I think you're over-complicating this. Though it would help if you could be more specific about how data1 and data2 would be related. What kind of data are we working with? The choice of how to work with it does depend somewhat on how it's being used.

Hi Narue,

Sorry for the confusion. Basically, what I need is to set up a mutual link between data1 and data2. I mean that for each element of data1, I need to find all its related elements in data2, and vice versa.
Any suggestions on this?

That's just another way of saying what you've already said. What does this relation represent? data1 and data2 are unhelpful names, and "related elements" is equally unhelpful. What I want to know is what problem are you trying to solve with this setup?

Say data1 represents manufacturing companies and data2 represents supplier companies. Each manufacturer has multiple suppliers, and each supplier does business with multiple manufacturers.
Given a manufacturer, I need to find all its suppliers efficiently for a price study. Given a supplier, I also need to find all its customers.
Now I just want to set up such a database so that searching in both directions is efficient.
Thanks.
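One possible shape for such a two-way lookup, offered as a sketch rather than a tested answer. It assumes both manufacturers and suppliers have already been given dense int ids (0..n-1), which the original code does not do for suppliers; Link and TradeIndex are hypothetical names. It swaps the per-element map and list for sorted vectors, which store their elements contiguously and avoid the per-node pointers that map and list carry. The actual saving on the data set above has not been measured.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record: one supplier linked to a manufacturer.
struct Link {
    int supplier;  // dense supplier id (assumed, see lead-in)
    float info;    // the per-pair reference value from the original post
};

class TradeIndex {
public:
    TradeIndex(std::size_t manufacturers, std::size_t suppliers)
        : suppliers_of_(manufacturers), customers_of_(suppliers), finalized_(false) {}

    // Record one manufacturer/supplier relation in both directions.
    void add(int manufacturer, int supplier, float info) {
        suppliers_of_.at(manufacturer).push_back(Link{supplier, info});
        customers_of_.at(supplier).push_back(manufacturer);
        finalized_ = false;
    }

    // Sort once after loading so pair lookups can use binary search.
    void finalize() {
        for (auto& links : suppliers_of_)
            std::sort(links.begin(), links.end(),
                      [](const Link& a, const Link& b) { return a.supplier < b.supplier; });
        for (auto& ids : customers_of_)
            std::sort(ids.begin(), ids.end());
        finalized_ = true;
    }

    // All suppliers of a manufacturer, and all customers of a supplier.
    const std::vector<Link>& suppliers_of(int m) const { return suppliers_of_.at(m); }
    const std::vector<int>& customers_of(int s) const { return customers_of_.at(s); }

    // Pointer to the info for one pair, or nullptr if the pair is unrelated.
    const float* info(int manufacturer, int supplier) const {
        assert(finalized_);  // binary search needs the sorted order
        const std::vector<Link>& links = suppliers_of_.at(manufacturer);
        auto it = std::lower_bound(links.begin(), links.end(), supplier,
                                   [](const Link& l, int s) { return l.supplier < s; });
        return (it != links.end() && it->supplier == supplier) ? &it->info : nullptr;
    }

private:
    std::vector<std::vector<Link>> suppliers_of_;  // indexed by manufacturer id
    std::vector<std::vector<int>> customers_of_;   // indexed by supplier id
    bool finalized_;
};
```

Usage would be: call add() for every pair while parsing, call finalize() once, then query suppliers_of(), customers_of(), or info() as needed.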
