This article shows how to optimize memory allocation in C++ using placement new. It applies especially to use cases that create a large number of objects.

A brief on the problem:
Let's take an example problem for simplicity:
- I have a GSM network with a cell in it.
- The cell's coverage area is divided into a pixel grid with dimensions 2000x2000.
- At each pixel the signal strength is measured.
The requirement is to find the % of pixels in a cell that have below par signal strength.

Solution 1 using the good old C:
In C, you deal with raw memory. If you want memory, you allocate the number of bytes you need. No guarantees are given regarding the contents of that memory (when using malloc()). At best you can initialize the whole block with zeros (calloc() or memset()), but not with a value specific to your application (not unless you REALLY go deep into how your specific compiler lays out the memory for your data).

// Solution 1
#include <stdio.h>
#include <stdlib.h>
#define GRID_WIDTH 200
#define GRID_HEIGHT 200
#define MIN_SIGNAL_STRENGTH 3

typedef struct pixelStruct {
    int _x;
    int _y;
    int _signal_strength;
} t_pixel;

void print_pixel( const char* msg, const t_pixel* ptr ) {
    /* plain C: use printf, since cout/endl/hex only exist in C++ */
    printf("%s\n\tptr = %p, x = %d, y = %d, signal_strength = %d\n", msg, (const void*)ptr, ptr->_x, ptr->_y, ptr->_signal_strength);
}

int main() {
    t_pixel* pixelArray[GRID_WIDTH][GRID_HEIGHT];

    srand ( 21346 );
    int w, h;
    // create the pixels..
    for ( w = 0; w < GRID_WIDTH; w++ )
        for ( h = 0; h < GRID_HEIGHT; h++ )
            pixelArray[w][h] = malloc(sizeof(t_pixel));

    // see what the memory looks like just after allocation.
    print_pixel("after allocation, before initialization", pixelArray[0][0]);

    // initialize the pixels..
    for ( w = 0; w < GRID_WIDTH; w++ ) {
        for ( h = 0; h < GRID_HEIGHT; h++ ) {
            t_pixel* ptr = pixelArray[w][h];
            ptr->_signal_strength = rand()%10;
            ptr->_x = w;
            ptr->_y = h;
        }
    }

    // see what the memory looks like after initialization.
    print_pixel("after allocation AND initialization", pixelArray[0][0]);

    // calculate the % of pixels with below par signal strength
    int numPixelsWithLowSignal = 0;
    for ( w = 0; w < GRID_WIDTH; w++ ) {
        for ( h = 0; h < GRID_HEIGHT; h++ ) {
            if ( pixelArray[w][h]->_signal_strength < MIN_SIGNAL_STRENGTH )
                ++numPixelsWithLowSignal;
        }
    }
    
    printf("%d%% of pixels have below par signal strength.\n", numPixelsWithLowSignal * 100 / (GRID_WIDTH*GRID_HEIGHT));

    // free up the memory
    for ( w = 0; w < GRID_WIDTH; w++ )
        for ( h = 0; h < GRID_HEIGHT; h++ )
            free(pixelArray[w][h]);

    return 0;
}

/*
// OUTPUT:
after allocation, before initialization
        ptr = 0x3e3f48, x = 4063800, y = 4063800, signal_strength = 0
after allocation and initialization
        ptr = 0x3e3f48, x = 0, y = 0, signal_strength = 9
29% of pixels have below par signal strength.
*/

Some basics that we can take away from this code are:
1. malloc() only allocates memory. It knows NOTHING about how the memory will actually be used and thus returns a void*. We use it as a t_pixel* in our case. So each call to malloc(sizeof(t_pixel)) just marks 12 bytes## (4 bytes for each int member variable) as allocated. It does NOTHING to the contents of the memory it marks.
2. In the second loop we initialize each pixel, giving it the correct coordinates and a random signal strength.

As the output of the code shows, x and y hold leftover (indeterminate) values before they are initialized in the second loop; signal_strength only happens to read as 0 in this run.

So to summarize: creation of each pixel can be divided into two parts:
1. memory allocation:
    t_pixel* ptr = malloc(sizeof(t_pixel)); // memory allocation
2. initialization:
    ptr->_signal_strength = rand()%10; // initialization
    ptr->_x = w; // initialization
    ptr->_y = h; // initialization

Solution 2 using C++ new operator()
C++ brings us two new concepts (compared to C) that are relevant to our problem.
1. One of the basic improvements of C++ classes over C structs is the concept of constructors**, which let you initialize the contents of the allocated memory in a more organized manner.
2. operator new() (instead of malloc).

A simple C++ solution doing the same as the previous C code would look like this:

// Solution 2
#include <Common.h> // attached to article
#define GRID_WIDTH 2000
#define GRID_HEIGHT 2000
#define MIN_SIGNAL_STRENGTH 3

class pixel {
public:
    pixel(int x, int y, int signal_strength) {
        _signal_strength = signal_strength;
        _x = x;
        _y = y;
    }

    // The c'tor above can also be written using c'tor initializers,
    // which is the preferred form (and can be faster for members of class type). Syntax:
    // pixel(int x, int y, int signal_strength) :  _signal_strength (signal_strength), _x(x), _y(y)
    // { }

    int _signal_strength ;
    int _x;
    int _y;

    void print_pixel(const char* msg) const {
        cout << msg << endl << "\tptr = " << hex << this << ", x = " << _x
            << ", y = " << _y << ", signal_strength = " << _signal_strength
            << endl;
    }
};

// main program: create the pixels, then report the % with low signal strength
int main() {
    startClock();

    srand ( 213462434L );
    vector<pixel*> pixelVec ;
    // create AND initialize pixels..
    for ( size_t w = 0; w < GRID_WIDTH; w++ )
        for ( size_t h = 0; h < GRID_HEIGHT; h++ )
            pixelVec.push_back( new pixel(w, h, rand()%10) );

    printClock("Time for default new operator(): ");

    // see what the memory looks like..
    pixelVec[0]->print_pixel("after allocation AND initialization");

    // calculate the % of pixels with below par signal strength
    int numPixelsWithLowSignal = 0;
    for ( size_t i = 0; i < pixelVec.size(); i++ )
        if ( pixelVec[i]->_signal_strength < MIN_SIGNAL_STRENGTH )
            ++numPixelsWithLowSignal;

    cout << dec << (numPixelsWithLowSignal * 100 / (GRID_WIDTH*GRID_HEIGHT)) << "% of pixels have below par signal strength." << endl;

    // cleanup..
    for ( size_t i = 0; i < pixelVec.size(); i++ )
        delete pixelVec[i];
    pixelVec.clear();

    printClock("Total time including delete: ");
    return 0 ;
}

/*
OUTPUT:
Time for default new operator(): 9937487 micro seconds
after allocation AND initialization
        ptr = 0x3e3f10, x = 0, y = 0, signal_strength = 4
29% of pixels have below par signal strength.
Total time including delete: 20218721 micro seconds
*/

As we can see, the basic difference between Solution 1 & 2 is that we use operator new, which does both memory allocation and initialization (by calling the constructor of the class pixel).
In other words, this:
    new pixel(w, h, rand()%10)
is conceptually the same as (pseudocode, a constructor cannot be called like this directly):
    pixel* ptr = malloc(sizeof(pixel)); // allocate
    ptr->pixel(w, h, rand()%10); // call constructor to initialize
which in turn is the same as:
    pixel* ptr = malloc(sizeof(pixel));
    ptr->_x = w;
    ptr->_y = h;
    ptr->_signal_strength = rand()%10;

Solution 3 using C++ placement new operator()

As we learned, operator new does 2 things: 1) memory allocation and 2) initialization via a call to the constructor.
C++ provides an overloaded version of the new operator called placement new, which leaves the job of memory allocation to the user. This means that if the user knows a better / faster way to allocate memory, they can do so and then call placement new to perform only the initialization.

// Solution 3 using placement new operator.
#include <Common.h> // attached to article
#define GRID_WIDTH 2000
#define GRID_HEIGHT 2000
#define MIN_SIGNAL_STRENGTH 3

class pixel {
public:
    pixel(int x, int y, int signal_strength) {
        _signal_strength = signal_strength;
        _x = x;
        _y = y;
    }
    int _signal_strength ;
    int _x;
    int _y;

    void print_pixel(const char* msg) const {
        cout << msg << endl << "\tptr = " << hex << this << ", x = " << _x
            << ", y = " << _y << ", signal_strength = " << _signal_strength
            << endl;
    }
};

// main program: create the pixels, then report the % with low signal strength
int main() {
    startClock();

    void* pre_allocated_memory = malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(pixel));

    srand ( 213462434L );
    vector<pixel*> pixelVec ;
    // create AND initialize pixels..
    for ( size_t w = 0; w < GRID_WIDTH; w++ )
        for ( size_t h = 0; h < GRID_HEIGHT; h++ ) {
            // pixelVec.push_back( new pixel(w, h, rand()%10) );
            pixelVec.push_back( new (pre_allocated_memory) pixel(w, h, rand()%10) );
            // as we've used the address pointed by pre_allocated_memory
            // make pre_allocated_memory point to next slot
            pre_allocated_memory = (pixel*) pre_allocated_memory + 1;
        }

    printClock("Time for placement new operator(): ");

    // see what the memory looks like..
    pixelVec[0]->print_pixel("after allocation AND initialization");

    // calculate the % of pixels with below par signal strength
    int numPixelsWithLowSignal = 0;
    for ( size_t i = 0; i < pixelVec.size(); i++ )
        if ( pixelVec[i]->_signal_strength < MIN_SIGNAL_STRENGTH )
            ++numPixelsWithLowSignal;

    cout << dec << (numPixelsWithLowSignal * 100 / (GRID_WIDTH*GRID_HEIGHT)) << "% of pixels have below par signal strength." << endl;

    // cleanup..
    pixelVec.clear();
    free(pre_allocated_memory);

    printClock("Total time including delete: ");

    return 0 ;
}

/*
Time for placement new operator(): 218748 micro seconds
after allocation AND initialization
        ptr = 0x4f0020, x = 0, y = 0, signal_strength = 4
29% of pixels have below par signal strength.
Total time including delete: 484370 micro seconds
*/

As described above, we know up front that we need to create GRID_WIDTH * GRID_HEIGHT objects of type pixel. So instead of calling the new operator GRID_WIDTH * GRID_HEIGHT times to perform the memory allocation, we do it once using the good old malloc(). We then call placement new, which uses the required number of bytes inside the pre-allocated memory and calls the constructor of pixel to do the initialization.
Do note this statement: pre_allocated_memory = (pixel*) pre_allocated_memory + 1; . It moves the pointer forward so that the next pixel does not use the same memory.

By using placement new, we replace:
- GRID_WIDTH * GRID_HEIGHT number of memory allocations of sizeof(pixel) bytes each
with
- one memory allocation of GRID_WIDTH * GRID_HEIGHT * sizeof(pixel) bytes

This is what gives us the performance boost*#.
In the given example run this is an improvement of about 45 times: 218748 micro seconds for placement new vs. 9937487 micro seconds for default new. These are single-run timings from one machine, so treat the exact ratio as indicative only.


NOTES:
*# The amount of performance improvement also depends on the complexity of the constructor code. In this example each constructor call initializes 3 int variables. If the constructor were more complex, the % benefit would shrink, but you can still expect a noticeable boost whenever allocation dominates the cost.
## Since the standard does not fix the size of an int, this is not necessarily true. It also ignores the fact that there could be alignment padding in the struct that increases its size.
** C++ structs can also have constructors, unlike C structs.


---------------------------------------------------------------------------------------

A word on freeing memory allocated using placement new
operator delete() does exactly the opposite of what operator new() does. It:
- un-initializes / destroys the object by calling the destructor.
- frees the memory that was used by the object.

Unlike the ordinary new expression, which has a matching delete expression, there is no "placement delete" expression that you can call on an object created with placement new. (The placement form of operator delete() that does exist is only invoked automatically when a constructor throws during placement new.)
So when using placement new, we must do these 2 things explicitly ourselves.
In the example used in this article the object in question (pixel) is quite trivial and has no need for user-defined destruction. In cases where the object requires non-trivial destruction, the user must explicitly invoke the destructor. E.g. let's say each pixel instance held on to some system resource (say a file descriptor); then to release that resource, the destructor must be invoked in addition to freeing the buffer (pre_allocated_memory).

An example would look like this:

// cleanup
#include <Common.h> // attached to article
#include <fstream>  // for ifstream, which Common.h does not pull in
using std::ifstream;
#define GRID_WIDTH 2000
#define GRID_HEIGHT 2000
#define MIN_SIGNAL_STRENGTH 3

class pixel {
    ifstream* pFileStream;
public:
    pixel(int x, int y, int signal_strength) {
        _signal_strength = signal_strength;
        _x = x;
        _y = y;
        pFileStream = new ifstream("somefile");
    }
    ~pixel() {
        if(NULL != pFileStream) {
            pFileStream->close();
            delete pFileStream; // release the ifstream allocated in the constructor
            pFileStream = NULL;
        }
    }

    void print_pixel(const char* msg) const {
        cout << msg << endl << "\tptr = " << hex << this << ", x = " << _x
            << ", y = " << _y << ", signal_strength = " << _signal_strength
            << endl;
    }

public:
    int _signal_strength ;
    int _x;
    int _y;

};

// main program: create the pixels, report the % with low signal strength, then clean up
int main() {
    startClock();

    void* pre_allocated_memory = malloc(GRID_WIDTH * GRID_HEIGHT * sizeof(pixel));

    srand ( 213462434L );
    vector<pixel*> pixelVec ;
    // create AND initialize pixels..
    for ( size_t w = 0; w < GRID_WIDTH; w++ )
        for ( size_t h = 0; h < GRID_HEIGHT; h++ ) {
            // pixelVec.push_back( new pixel(w, h, rand()%10) );
            pixelVec.push_back( new (pre_allocated_memory) pixel(w, h, rand()%10) );
            // as we've used the address pointed by pre_allocated_memory
            // make pre_allocated_memory point to next slot
            pre_allocated_memory = (pixel*) pre_allocated_memory + 1;
        }

    printClock("Time for placement new operator(): ");

    // see what the memory looks like..
    pixelVec[0]->print_pixel("after allocation AND initialization");

    // calculate the % of pixels with below par signal strength
    int numPixelsWithLowSignal = 0;
    for ( size_t i = 0; i < pixelVec.size(); i++ )
        if ( pixelVec[i]->_signal_strength < MIN_SIGNAL_STRENGTH )
            ++numPixelsWithLowSignal;

    cout << dec << (numPixelsWithLowSignal * 100 / (GRID_WIDTH*GRID_HEIGHT)) << "% of pixels have below par signal strength." << endl;

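    // cleanup..
    // for non-trivial destruction, call the destructor explicitly.
    for ( size_t i = 0; i < pixelVec.size(); i++ )
        pixelVec[i]->~pixel();

    // pre_allocated_memory now points one past the last pixel, so it must not
    // be passed to free(). The first pixel sits at the start of the block.
    free(pixelVec[0]);
    pixelVec.clear();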

    printClock("Total time including delete: ");

    return 0 ;
}

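A word on padding and alignment
Placement new does not take any special care of padding or data structure alignment; it constructs the object at exactly the address it is given.
Given that the sizeof() operator accounts for padding when computing the size of a type, and that malloc() returns memory suitably aligned for any fundamental type, it is safe to assume that: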

// this loop...
for ( size_t h = 0; h < NUM_OBJECTS; h++ ) {
	new (pre_allocated_memory) my_type();
	pre_allocated_memory = (my_type*) pre_allocated_memory + 1;
}

// ...will NOT exhaust the pre_allocated_memory as long as
// pre_allocated_memory is allocated using sizeof(). e.g.
void* pre_allocated_memory = malloc(NUM_OBJECTS * sizeof(my_type));

In other words, do not use magic numbers that you come up with as the size of your type. Trust that job to sizeof(). In the example, as pixel contains 3 int variables, you could say malloc(NUM_PIXELS * 12); but DON'T.

Also see this for more info on data alignment and portability.

Edited 5 Years Ago by Narue: Updated per author's request

Attachments
#pragma once

#include <cstdio>
#include <iostream>
#include <vector>
//#include <map>
//#include <algorithm>
#include <cstdlib>

using namespace std;

// Time related macros
#include <sys/time.h>
namespace ___kashyap_ns {
    double getTimeInMicroSecs() {
        struct timeval tv;
        gettimeofday(&tv, NULL);
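        // 1e6 micro seconds per second (10e6 is 10^7 and inflates any interval that crosses a second boundary)
        return tv.tv_usec + 1e6 * tv.tv_sec;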
    }
}

#define __DEFINE_CLK double ___start_time = 0
#define __START_CLK ___start_time = ___kashyap_ns::getTimeInMicroSecs()
#define __PRINT_CLK(string) cout << string << (___kashyap_ns::getTimeInMicroSecs() - ___start_time) << " micro seconds" << endl

While I think that using the placement new operator in the appropriate place is a powerful tool, I think that your example takes away from any of that.
Most of your example deals with the mechanics of defining and constructing your pixel matrices. Only a very small portion of the code uses memory allocation - which is the purported reason for the post.

In addition to that, I think it is most critical that you discuss the memory release details of using placement new as it is substantially different from how you normally release memory. Ignoring the fact that you do absolutely no free/delete for any of the data you allocate, calling regular operator delete on the location provided to placement new is inappropriate. Consider the following:

#include <iostream>
#include <new>

struct Foo {
    Foo() { std::cout << "Foo()  : " << (void *)this << std::endl; }
    ~Foo () { std::cout << "~Foo() : " << (void *)this << std::endl; }
};

int main () {
    char buffer[1024] = {0};
    Foo * foo = new Foo();
    Foo * bfoo = new (buffer) Foo(); // placement new
    delete foo;
    delete bfoo; // Uh oh...
    return 0;
}

That second delete is wrong; the memory it is referencing wasn't even allocated on the heap! You need to make provisions to call the destructor in this case without having the memory reclaimed. Instead, the above code snippet would need to look something like the following:

    char buffer[1024] = {0};
    Foo * foo = new Foo();
    Foo * bfoo = new (buffer) Foo(); // placement new
    delete foo;
    bfoo->Foo::~Foo(); // Ok, buffer not reclaimed

Without discussing these details the essence of a tool such as placement new is lost.

Edited 5 Years Ago by L7Sqr: n/a

Nice explanation, but I do have a few things to point out.

First, a few details. You make extensive use of variable names, defines and identifiers which start with an underscore character. This is explicitly forbidden by the C++ standard. All identifiers starting with one or two underscore characters are reserved for the compiler vendors for use in their implementation (of either the standard libraries or language extensions, or both). Most of the times your code will work anyways, but you should learn to change your habits in that regard if you wish to write portable, standard-compliant code.

Second, I find it a bit bizarre that you make this whole explanation about initialization / allocation / construction, and yet, in your pixel constructor, you initialize the data members in the body of the constructor as opposed to the initialization list. This poor choice will lead to your constructor taking roughly twice the execution time that it could. So, here is a more appropriate pixel class:

class pixel {
public:
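    pixel() { } //provide a default constructor (members are left uninitialized).
    // initializers are listed in declaration order, which is the order they run in.
    pixel(int aX, int aY, int aSignalStrength) : signal_strength(aSignalStrength), x(aX), y(aY) { }
    int signal_strength;
    int x;
    int y;

    void print_pixel(const char* msg) const {
        // %p expects a void*; passing a pointer to %x is undefined behavior.
        printf("%s\n\tptr = %p, x = %d, y = %d, signal_strength = %d\n", msg, (const void*)this, x, y, signal_strength);
    }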
};

Now for the meat of it. I think that you missed a very important piece here (probably should be the fourth iteration of your example). The concept that you are talking about (pre-allocating, and placement new) is built into all STL containers via the "hidden" template argument which is the allocator. An allocator (such as std::allocator<T>) is exactly the component you use to implement this pattern that you exposed. In fact, it is most likely that, in this case, the default allocator (std::allocator), which uses placement new, is exactly what is needed in this case, and its inner-workings are encapsulated, so you don't even have to worry about it, but here is an example that explicitly pre-allocates with reserve().

// main program: create the pixels by value, then report the % with low signal strength
int main() {
    __DEFINE_CLK;
    __START_CLK;

    srand ( 213462434L );
    vector<pixel> pixelVec; //store by value

    pixelVec.reserve( WIDTH * HEIGHT ); //pre-allocate.

    // create AND initialize pixels..
    for ( int w = 0; w < WIDTH; w++ )
        for ( int h = 0; h < HEIGHT; h++ )
            // Use stack-based temporary for new pixel:
            pixelVec.push_back( pixel(w, h, rand()%10) );

    __PRINT_CLK("Time for placement new operator(): ");

    // see what the memory looks like..
    pixelVec[0].print_pixel("after allocation AND initialization");

    // calculate the % of pixels with below par signal strength
    int numPixelsWithLowSignal = 0;
    for ( size_t i = 0; i < pixelVec.size(); i++ ) // size_t matches the type of size()
        if ( pixelVec[i].signal_strength < MIN_SIGNAL_STRENGTH )
            ++numPixelsWithLowSignal;

    cout << numPixelsWithLowSignal * 100 / pixelVec.size() << "% of pixels have below par signal strength." << endl;

    return 0 ;
}

But, if you insist on making a custom allocator, you can also make the initialization even more direct, in which case you will be using placement new, of course. Then, something like this can be done:

class my_pixel_allocator : private std::allocator<pixel> {
  private:
    int next_x;
    int next_y;
  public:
    //provide the required no-throw constructors / destructors:
    my_pixel_allocator() throw() : std::allocator<pixel>(), next_x(0), next_y(0) { };
    my_pixel_allocator(const my_pixel_allocator& rhs) throw() : std::allocator<pixel>(rhs), next_x(rhs.next_x), next_y(rhs.next_y) { };
    template <typename U>
    my_pixel_allocator(const std::allocator<U>& rhs) throw() : std::allocator<pixel>(rhs), next_x(0), next_y(0) { };
    ~my_pixel_allocator() throw() { };
 
    //import the required typedefs:
    typedef std::allocator<pixel>::value_type value_type;
    typedef std::allocator<pixel>::pointer pointer;
    typedef std::allocator<pixel>::reference reference;
    typedef std::allocator<pixel>::const_pointer const_pointer;
    typedef std::allocator<pixel>::const_reference const_reference;
    typedef std::allocator<pixel>::size_type size_type;
    typedef std::allocator<pixel>::difference_type difference_type;

    //import all the member functions of std::allocator that can simply be reused
    //(address and destroy exist only in the pre-C++20 allocator interface):
    using std::allocator<pixel>::address;
    using std::allocator<pixel>::allocate;
    using std::allocator<pixel>::deallocate;
    using std::allocator<pixel>::max_size;
    using std::allocator<pixel>::destroy;

    //redefine the construct function:
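    void construct( pointer p, const_reference ) {
      new ((void*)p) pixel(next_x, next_y, rand() % 10);
      //walk the grid like the nested loops did: advance y, and step x when y wraps
      if ( ++next_y == HEIGHT ) {
        next_y = 0;
        next_x = (next_x + 1) % WIDTH;
      }
    };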
};

//The main function then becomes:
int main() {
    __DEFINE_CLK;
    __START_CLK;

    srand ( 213462434L );
    typedef vector<pixel,my_pixel_allocator> pix_vector;

    pix_vector pixelVec(WIDTH * HEIGHT); //store by value, allocate, and initialize.

    __PRINT_CLK("Time for placement new operator(): ");

    // see what the memory looks like..
    pixelVec[0].print_pixel("after allocation AND initialization");

    // calculate the % of pixels with below par signal strength
    int numPixelsWithLowSignal = 0;
    for( pix_vector::iterator it = pixelVec.begin(); it != pixelVec.end(); ++it )
        if ( it->signal_strength < MIN_SIGNAL_STRENGTH )
            ++numPixelsWithLowSignal;

    cout << numPixelsWithLowSignal * 100 / pixelVec.size() << "% of pixels have below par signal strength." << endl;

    return 0 ;
}

Final note, about your little Common.h file, I don't think it is good practice to do that. First, the headers included are not seen by the programmer of the main program. Second, the few defines you have in there are trivial. Third, you are importing the std namespace, that's a big NO NO! Don't import namespaces inside a header... just don't!

All identifiers starting with one or two underscore characters are reserved for the compiler vendors for use in their implementation (of either the standard libraries or language extensions, or both).

Just a minor nit, since I feel the need to defend my own naming convention. ;) What you've said is close enough that "don't use leading underscores" is a valid guideline. The actual rules are less strict, but more subtle:

  • Two leading underscores or a leading underscore followed by an upper case letter is reserved in all cases.
  • One leading underscore followed by a lower case letter or digit is only reserved in the global namespace.

So while names such as __DEFINE_CLK and __START_CLK are invading the implementation's reserved identifier space regardless of where they appear, data members like _x and _y are just fine.

Well, you are right Narue. I knew about those subtleties, but I generally apply the simple "don't use leading underscores" rule anyways.

>>data members like _x and _y are just fine.

They are, sure. But what if you later decide to eliminate the data member or change its name? If you rely on the compiler to point out the places in your implementation that still use the removed member, it might remain silent, assume you are referring to a global identifier of the same name, and leave you with a nasty, silent bug on your hands. I admit that this is a bit far-fetched.

Don't get me wrong, there are plenty of people using leading underscores for this and that in their naming conventions. And where it's safe w.r.t. the C++ standard rules, and working, it's definitely not worth refactoring all the code to fix such a minor detail (and risking introducing bugs in the process). But, it's probably better to get used to another naming convention, if you are still at that point in the learning or have the liberty to do so (e.g. you don't have to use the already established naming convention of your project).

Thanks for all the comments.

@tonyjv

... and the fly-weight pattern?

My apologies. When I started, I wanted to continue the article to include the fly-weight pattern as well, but by the end it was 5:30 pm (time to leave) and the article was already long enough, so I decided to write a separate one. But then I forgot to remove the mention from the header.

@L7Sqr

In addition to that, I think it is most critical that you discuss the memory release details of using placement new as it is substantially different from how you normally release memory. Ignoring the fact that you do absolutely no free/delete for any of the data you allocate, calling regular operator delete on the location provided to placement new is inappropriate...

Completely agree. Let me see how I can add it.


@mike_2000_17

Second, I find it a bit bizarre that you make this whole explanation about initialization / allocation / construction, and yet, in your pixel constructor, you initialize the data members in the body of the constructor as opposed to the initialization list. This poor choice will lead to your constructor taking roughly twice the execution time that it could. So, here is a more appropriate pixel class:

First let me say I agree that initializers are better for initialization.
This is how I wrote the code at first (in fact the members were all declared const), but for someone who is new to C++, initializers are not as readable as what's in there now. Especially given that I was comparing it line-by-line to the original C syntax.
In any case, the idea behind the article is not to teach initialization of members in C++ syntax. The message to take away is that "new calls the constructor to initialize the members".

Now for the meat of it. I think that you missed a very important piece here (probably should be the fourth iteration of your example). The concept that you are talking about (pre-allocating, and placement new) is built into all STL containers via the "hidden" template argument which is the allocator. An allocator (such as std::allocator<T>) is exactly the component you use to implement this pattern that you exposed. In fact, it is most likely that, in this case, the default allocator (std::allocator), which uses placement new, is exactly what is needed in this case, and its inner-workings are encapsulated, so you don't even have to worry about it, but here is an example that explicitly pre-allocates with reserve().

I do hope you will believe me when I say that I had planned a fourth iteration, and planned it exactly for the allocator explanation. But in the end I decided that allocators do not go with the title of this post (placement new). They have a wider application (e.g. the fly-weight pattern).
So I will be writing another one of these which shows how to use allocator + fly-weight pattern along with placement new.

Final note, about your little Common.h file, I don't think it is good practice to do that. First, the headers included are not seen by the programmer of the main program. Second, the few defines you have in there are trivial. Third, you are importing the std namespace, that's a big NO NO! Don't import namespaces inside a header... just don't!

I KNOW! The reason for putting it in a separate header is purely to
- reduce the number of lines I have to post in each CODE tag
- keep the reader's focus on the meat
Though I do take your comment on using namespace std; . I was just lazy.. :)

pixelVec.reserve( WIDTH * HEIGHT ); //pre-allocate.

Thanks. It's been some 4-5 years since I coded in C++ (that's the reason I wrote this in the first place :) ). I was looking for a vector constructor that took the initial capacity as an argument and didn't find one. :)

@Narue

So while names such as __DEFINE_CLK and __START_CLK are invading the implementation's reserved identifier space regardless of where they appear, data members like _x and _y are just fine.

_*_CLK are #defines (not variables I mean). Do they still pose a problem?

Edited 5 Years Ago by thekashyap: n/a

>>_*_CLK are #defines (not variables I mean). Do they still pose a problem?

Of course. What if something else (anywhere) in the standard libraries' implementation from any compiler-vendor you might want to compile this code with has the same name? After all, these names are reserved for compiler-vendors (or OS implementers), they are allowed to use them at will. Then, what will happen is that the preprocessor will sub-in your MACRO's definition in all those places where that item is found, and then you will get extremely weird, obscure, MACRO-expanded errors in the standard libraries' code (compilation errors) and you will probably plow your way through Hell to find what is causing the errors. It's definitely worse if you have #defines with these leading underscores than if you have variable names (because at least, variable names respect scopes, #defines don't).

>>I do hope you would believe me when I say that I had planned a fourth iteration

I do believe you. You had a great post and a great explanation, I found it rather bizarre that you didn't go that one extra step and mention the allocator (which essentially encapsulates the placement new operator). Don't get me wrong, most of my comments on your post were about minor details (except for the importing of a namespace in a header, that's a pretty big deal).

Edited 5 Years Ago by mike_2000_17: n/a

Thanks to moderators for updating.
@everyone who commented, the original post is now updated with all corrections.

New Common.h is here:

#ifndef MY_COMMON_HEADER_FOR_PLACEMENT_NEW_EXAMPLE
#define MY_COMMON_HEADER_FOR_PLACEMENT_NEW_EXAMPLE

#include <cstdio>
#include <cstdlib>

#include <iostream>
using std::cin;
using std::cout;
using std::flush;
using std::endl;
using std::hex;
using std::dec;
using std::fixed;

#include <algorithm>
using std::for_each;

#include <vector>
using std::vector;

#include <string>
using std::string;

#include <sys/time.h>
namespace {
    double m___start_time_asdfsa = 0;
    double getTimeInMicroSecs() {
        struct timeval tv;
        gettimeofday(&tv, NULL);
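        // 1e6 micro seconds per second (10e6 is 10^7 and inflates any interval that crosses a second boundary)
        return tv.tv_usec + 1e6 * tv.tv_sec;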
    }

    inline void printClock( const char* msg ) {
        cout << msg << (long)(getTimeInMicroSecs() - m___start_time_asdfsa) << " micro seconds" << endl;
    }
    inline void startClock() {
        m___start_time_asdfsa = getTimeInMicroSecs();
    }
}

#endif