I have to read and store 70 million Double objects in a Vector. The problem is that Java seems to allocate 40 bytes of memory for each Double object, so storing that many Doubles takes approximately 2.8 GB. It seems very strange that a simple object like Double costs 40 bytes of memory. I am new to Java and never faced this issue in C++. Could you please tell me how large an object is in Java?

best regards,

If memory usage is a big issue in your app, you should use an array of double primitives, not a Vector of Double objects. Each double in an array occupies 64 bits (8 bytes).
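To make the arithmetic concrete, here is a minimal sketch of the primitive-array approach. The class name PrimitiveArrayDemo and the readValues helper are made up for illustration (readValues just generates numbers as a stand-in for your real input), and the size estimate ignores the small fixed header of the array object itself.

```java
final class PrimitiveArrayDemo {

    /** Payload size of `count` primitive doubles, ignoring the array header. */
    static long payloadBytes(long count) {
        return count * Double.BYTES; // Double.BYTES is 8
    }

    /** Hypothetical stand-in for reading values from a file or stream. */
    static double[] readValues(int count) {
        double[] values = new double[count];
        for (int i = 0; i < count; i++) {
            values[i] = i * 0.5;
        }
        return values;
    }

    public static void main(String[] args) {
        // A small array here so the demo runs with the default heap size.
        double[] values = readValues(1_000_000);
        System.out.println("Stored " + values.length + " doubles");

        // 70,000,000 * 8 = 560,000,000 bytes, far below the 2.8 GB reported
        // for a Vector of Double objects.
        System.out.println("70 million doubles need about "
                + payloadBytes(70_000_000L) + " bytes");
    }
}
```

For the full 70 million values you would probably need to raise the maximum heap size (for example with the -Xmx option), since the array alone is roughly 560 MB.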

Thank you for the quick replies. My work always involves large amounts of data; 70 million values is just part of the dataset I am currently working with. Sometimes I need to store the data in other containers such as LinkedList or HashMap, so the array-of-double solution may not be applicable. I wonder whether Java allows storing a double instead of a Double object in the aforementioned containers. I have tried, but it did not allow it. Is there any hint for this situation? Or do I need to switch back to C++?

A database engine is useful when you have something to store for querying. In my work I deal with stream data, usually measurements from sensor networks with thousands of sensors. Each sensor produces GBs of data per day, so even storing one day's data on a single disk is a big issue. A single pass over the data is already costly, and scanning it hundreds of times is almost impossible. We have to scan the data once and build a statistical summary that is small compared to the entire dataset. A database engine will help to store the summary but never helps to build it.
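As an illustration of the one-pass idea, here is a minimal sketch of a summary that is updated once per measurement and never stores the measurements themselves. The post does not say which statistics are needed, so count, mean, variance, min and max are assumptions, and the variance update uses Welford's online algorithm as one reasonable choice. The class name RunningStats is made up for this example.

```java
final class RunningStats {
    private long count;
    private double mean;
    private double m2; // sum of squared deviations from the running mean
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;

    /** Folds one measurement into the summary (Welford's update). */
    void accept(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean);
        if (x < min) min = x;
        if (x > max) max = x;
    }

    long count() { return count; }

    double mean() { return count == 0 ? Double.NaN : mean; }

    /** Sample variance (divides by n - 1); NaN with fewer than two values. */
    double sampleVariance() { return count > 1 ? m2 / (count - 1) : Double.NaN; }

    double min() { return min; }

    double max() { return max; }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        // Stand-in for a sensor stream; real code would read values as they arrive.
        for (double x : new double[] {1.0, 2.0, 3.0, 4.0}) {
            stats.accept(x);
        }
        System.out.println("count=" + stats.count() + " mean=" + stats.mean()
                + " variance=" + stats.sampleVariance());
    }
}
```

The summary holds a handful of primitive fields regardless of how many values pass through it, so its memory use does not grow with the stream.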

According to this article, a Vector in Java takes 80 bytes and each Double takes 16. Unfortunately, it is not possible to store a primitive type in a standard collection like LinkedList or HashMap. The best (most memory-efficient) route, as JamesCherrill suggested, is to use an array of primitive types. Perhaps you could write your own collection class that stores its internal data in an array?
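A rough sketch of what such a collection class might look like is below. The name DoubleArrayList and its methods are invented for this example, not part of the standard library, and the growth factor of about 1.5 is just one common choice.

```java
import java.util.Arrays;

final class DoubleArrayList {
    private double[] data;
    private int size;

    DoubleArrayList() {
        this(16);
    }

    DoubleArrayList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("negative capacity: " + initialCapacity);
        }
        data = new double[Math.max(1, initialCapacity)];
    }

    /** Appends a value, growing the backing array by about 1.5x when it is full. */
    void add(double value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length + (data.length >> 1) + 1);
        }
        data[size++] = value;
    }

    double get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }

    /** Returns a copy trimmed to the number of stored values. */
    double[] toArray() {
        return Arrays.copyOf(data, size);
    }

    public static void main(String[] args) {
        DoubleArrayList list = new DoubleArrayList();
        for (int i = 0; i < 100; i++) {
            list.add(i * 0.25);
        }
        System.out.println("size=" + list.size() + " last=" + list.get(list.size() - 1));
    }
}
```

If the number of values is known in advance, passing it as the initial capacity avoids the temporary copies made while the array grows.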

Yes, that's basic routine, in the richest environments, and about internal rules.

Each of your sensors just writes its data to a local DB; the SQL engine then collects the data, cleans up the ballast, and periodically sends only the important data. From 4-6 GB per day per environment you then get a snapshot with all the desired data at just a few MB (in DB size), with low impact on local intranet traffic, disk usage, and processor load (and therefore electricity consumption).

Thank you, Kramerd. I think it depends on the system being used; at run time the size of a Double may be larger. I have tried a Vector of Double and an array of Double, and both need 400 MB for 10 million Double objects (40 bytes per element). Anyway, I will switch back to C++ and use the STL for memory-intensive applications, and only use Java with smaller datasets. I hope that in the future Java will support collections of primitive types, not only reference types as at present. Thank you all for the useful discussion.
