Since small object allocation in heap is bad, I am just wondering the range of the object size that is defined to be small.


I'm sure somebody will have a more definitive answer, but the basic problem with allocating small objects from the heap is the amount of overhead necessary to keep track of that object, and the extent to which the available heap memory is fragmented as small objects are allocated and freed in arbitrary order. The exact value of "small" will depend on a variety of factors including the word-size of your operating system (to store the address of a pointer, generally 32 bits or 64 bits), the amount of memory at your disposal, and how much waste you're willing to tolerate.

If you know you want to allocate consistent small objects like integers (not that a savvy developer would generally allocate an integer at a time anyway), you could always allocate a block of memory, big enough to store 100 integers, or 10,000, depending on how many you're likely to need, essentially:

int *ints = new int [100];

and then manage yourself which is which and whether you're using it or not, and knowing when you can safely free up the whole block. The problem is clearly also application-dependent. I do a fair bit of image-generation programming, so I'm always needing to store red, green and blue values (or more), each of which is usually at least an 8-bit integer (and often a float instead) for each pixel in a W-by-H-pixel image -- allocating the entire image in a single block of memory is smarter than, say, creating a linked-list of individually-allocated "pixel" objects. Computation on massive but sparse matrices, on the other hand, tends to favor solutions that allocate only as much memory as is needed to store and manage the non-zero entries. When a 64MB of RAM was a lot, I was more careful about how I allocated memory; now that 4GB is common, I tend to worry less; bounty leads naturally to laziness. :)

Another aspect allocation is how effectively you can re-use objects. A kernel device driver for sending and receiving 48-byte payload data blocks over an ATM network shouldn't allocate and free up those blocks -- instead keep unused blocks on a free-list, and allocate more only when the free-list is empty.

commented: nice +14

Thank you.
Just out of curiosity, you allocate the entire image in a single block of memory. that means you save the memory but increase the computational power, is that worth?

The default malloc() routine in stdlib (going back to C days here) isn't particularly fast, in addition to potential issues with fragmenting memory. So the "normal" way to create a 2D image would be something like:

float ***image = (float ***)malloc(height * sizeof(float**));
for (row=0; row<height; row++) {
  image[row] = (float **)malloc(width * sizeof(float*));
  for (col=0; col<width; col++)
    image[row][col] = (float *)malloc(3 * sizeof(float));  /* 3 channels per pixel */

and freeing up the image would be just as cumbersome, just so you could get at the blue part of the top-right pixel via image[0][width-1][2] Instead, since I know that in addition to the width*height*3 floats, I also need width*height pointers for each pixel and height pointers for each row, I can make a single call to malloc:

image = (float***)malloc(width*height*3*sizeof(float) +
                         width*height*sizeof(float*) +

and then painstakingly fill in the pointers to the correct offsets into the data. Because in this case I know that a float is 32 bits and a pointer (on a 32-bit machine) is also 32 bits, there's no issue with data alignment or padding. And since I allocated it with a single call, I can also free it with a single call: free(image); Whether the savings are worth the costs is entirely dependent on the usage scenario. When you're processing thousands of 4K x 3K images every night, each through several different image-processing steps, spread across as many workstations as you can get your hands on (for the case of a visual effects studio), anything that makes the code run faster and keeps the machine's available resources quicker to access is well worth the little bit of programmer effort. For a one-off, especially where the time to compute a new set of values for each pixel is vastly more than the overhead (such as generating the visual effect source image, maybe 2 hours or more per frame of film), then it doesn't matter so much. :)

>>Since small object allocation in heap is bad

From the perspective of experience, allocating anything from heap memory is bad, whatever size it has. You only do it when it is required. It is required when the size or type is not known at compile-time, or if it's too big to be on the stack.

When it comes to "small objects", we usually mean built-in types (like int, double, etc.) and objects that aren't much bigger than a few times the size of the native word size (i.e. size of an int or a pointer). But, usually, all POD-types (Plain Old Data types) are rarely allocated individually on the heap (not in an array), because they are rarely very big (too big for the stack), their size is known at compile-time (since there is just one), and their type is known at compile-time (i.e. POD-types are not, by definition, part of an inheritance scheme). There are a few exceptions (like shared objects), but they get almost always allocated statically (not on heap).

Finally, C++ has all the containers required to handle variable-size "arrays" in any way you like. And they are optimized to minimize heap allocations.

Thank you guys.
However, I know that malloc() is optimized for medium and large object. So it is not bad to allocate in heap. Is it right?

Moreover, for the containers, if an new obejct is added to them, the copy of that obecjt is also allocated in the heap too?

>>Moreover, for the containers, if an new obejct is added to them, the copy of that obecjt is also allocated in the heap too?

Yes, but containers like vector and deque for example, use methods to avoid having to reallocate memory on the heap again and again. Vector will preemptively allocate more space than necessary (in an exponential way) in anticipation of future additions of elements.

You have to understand that what is most expensive is the actual allocation and deallocation operations. So, the point is mostly to avoid frequent allocations/deallocations (especially in a loop). Static memory allocations (on the stack) are free (basically no overhead at all), while dynamic memory allocations (big or small) are always pretty expensive operations.

>>So it is not bad to allocate in heap. Is it right?

Consider allocations on the heap as a necessary evil, literally. It is necessary in many cases, but it has big penalties that you always want to avoid if it is not necessary to use it. So, it IS bad, but it IS sometimes necessary.

>>However, I know that malloc() is optimized for medium and large object.

Size matters in non-trivial ways, really. And anyways, malloc should not be used for objects anyways, only for (arrays of) built-in types, because malloc is a C function and will not call constructors of objects. When you talk of size, you really mean the overall size of the memory allocation. Allocating large blocks can be very slow because it may often require the heap to fetch additional memory from the operating system, which is super-expensive (a ton of clock-cycles). Allocating small blocks will be much faster because it is unlikely to require additional memory from the OS, however, it contributes to the fragmentation of the heap, and the more fragmented it is the harder it is for the heap to quickly find available memory blocks to allocate.

Long story short, dynamic memory allocation is expensive, in many ways, both for big or small, for malloc() or new (which is usually implemented with a malloc call anyways). But, it is often necessary, hence, again, it's a necessary evil.

thank you. I understood.