I am trying to understand COM memory layout, and using C++ classes just confuses me. In general I understand what is going on with the double indirection of function pointers, but as is so often the case in programming, one must understand the ugly details and not just the generalities. So I thought I would create a very simple example and attempt to call a few functions contained in a couple of interfaces without using typical C++ object call syntax, using only C-style function pointers and addresses I could dig/pry/coax out of a C++ class. I think I’ve been largely successful in this endeavor, but I am having some difficulties that I can’t explain. I would certainly appreciate it if someone knowledgeable in these matters (and who has perhaps waded through this stuff him/herself!) would take a look. For others just learning COM there may be some merit in trying to follow what I am doing here.

Here is what I’ve done. I created two interfaces, IX and IY, that each inherit from IUnknown and each contain just two functions: Fx1() and Fx2() in IX, Fy1() and Fy2() in IY. They are pure virtual functions within these two interfaces and are implemented in class CA, which publicly inherits both interfaces. Class CA contains nothing else to complicate matters. The implementation of these functions in class CA just outputs a message that they were called. Here is the very simple construct…

interface IX : IUnknown              //will cause inheritance of
{                                    //QueryInterface(),AddRef(),
 virtual void __stdcall Fx1()=0;     //and Release().
 virtual void __stdcall Fx2()=0;
};

interface IY : IUnknown
{
 virtual void __stdcall Fy1()=0;
 virtual void __stdcall Fy2()=0;
};

class CA : public IX, public IY      //publicly inherit interfaces IX and IY
{
 public:
 virtual HRESULT __stdcall QueryInterface(const IID& iid, void** ppv)
 {
  puts("\nCalled QueryInterface()");
  return S_OK;
 }

 virtual ULONG   __stdcall AddRef()
 {
  puts("Called AddRef()");
  return 0;
 }

 virtual ULONG   __stdcall Release()
 {
  puts("Called Release()");
  return 0;
 }

 virtual void    __stdcall Fx1(){printf("Called Fx1()\n");}
 virtual void    __stdcall Fx2(){printf("Called Fx2()\n");}
 virtual void    __stdcall Fy1(){printf("Called Fy1()\n");}
 virtual void    __stdcall Fy2(){printf("Called Fy2()\n");}
};

As you can see QueryInterface(), AddRef(), and Release() were also implemented to just make the class instantiable. From my reading of an article by Jeff Glatt “COM In Plain C” which I believe was pretty widely read on the net…

http://www.codeproject.com/KB/COM/com_in_c1.aspx

Jeff makes the statement that “a C++ class is really nothing more than a struct whose first member is always a pointer to an array – an array that contains pointers to all the functions inside that class”. This idea echoed a post I discovered here at daniweb.com from March 10, 2008 by Asadullah at…

http://www.daniweb.com/forums/showthread.php?p=556791&highlight=VTABLE#post556791

where Asadullah Ansari methodically dissects classes and vtables to call virtual functions using just C addresses and some rather nasty casting. Without Asadullah’s invaluable post I’m not sure I would have made the progress with this that I have, because the casting becomes really, really ugly, to put it mildly.

So that’s what I’m attempting to do here. It’s my hope that this exercise will further cement my understanding of COM memory layout. Here goes. In my main() function the first thing I do is declare a pointer to class CA and use it to create a new instance of CA as follows…

pCA=new CA;
printf("sizeof(CA)            = %u\t : An IX VTBL Ptr And A IY VTBL Ptr\n",  sizeof(CA));
printf("pCA                   = %u\t : Ptr To IX VTBL\n", (unsigned int)pCA);
printf("*(int*)(&pCA)         = %u\t : Same Thing With Just More Involved Notation.\n",  *(int*)(&pCA));
printf("*(int*)*(int*)(&pCA)  = %u\t : Should Point To IX::QueryInterface()???\n",  *(int*)*(int*)(&pCA));

The output from those statements is as follows…

sizeof(CA)            = 8       : An IX VTBL Ptr And A IY VTBL Ptr
pCA                   = 4144888 : Ptr To IX VTBL
*(int*)(&pCA)         = 4144888 : Same Thing With Just More Involved Notation.
*(int*)*(int*)(&pCA)  = 4224168 : Should Point To IX::QueryInterface()???

So far so good, I think! My understanding of class construction is that there should be a pointer to a vtable for each interface that is being inherited. Here, since we are inheriting both IX and IY that would make two pointers or 8 bytes. In the program I also declared an int* named dwPtr that I set equal to pCA to allow me to use base pointer offset notation to point to each of the two vtables as needed, i.e.,

int* dwPtr=0;

dwPtr=(int*)pCA;

So, to point sequentially to each of the vtables this would work…

for(i=0; i<2; i++)
    printf("%u\t%u\n", i, (unsigned int)&dwPtr[i]);

// Output:
//
// i    &dwPtr[i]
// ==========================================
// 0	4144888    holds pointer to IX vtable
// 1	4144892    holds pointer to IY vtable

Since we’ve located the respective vtables in memory, it should be theoretically possible to call the IX and IY interface functions through their function addresses within each vtable. To that end I created a simple typedef to make the function pointer calls in the vtable easier…

typedef void (*FN)(void);

One last issue before I present main() and the whole program together (it’s only 78 lines counting spaces) is that since interfaces IX and IY both themselves inherit the memory layout of IUnknown, the first three functions in each vtable will be QueryInterface(), AddRef(), and Release(). Here is the whole program, and the output directly follows…
(Sorry about the formatting. I struggled to make it fit better here but it just runs too wide. You'll have to 'Toggle Plain Text' or, better yet, paste it into a code editor.)

#include <stdio.h>                   //for printf(), puts(), etc.
#include <objbase.h>                 //for typedef struct interface
typedef void (*FN)(void);            //to ease use of function pointers

interface IX : IUnknown              //will cause inheritance of
{                                    //QueryInterface(),AddRef(),
 virtual void __stdcall Fx1()=0;     //and Release().
 virtual void __stdcall Fx2()=0;
};


interface IY : IUnknown
{
 virtual void __stdcall Fy1()=0;
 virtual void __stdcall Fy2()=0;
};


class CA : public IX, public IY      //will inherit pure abstract base
{                                    //classes IX and IY.
 public:
 virtual HRESULT __stdcall QueryInterface(const IID& iid, void** ppv)
 {
  puts("\nCalled QueryInterface()"); //QueryInterface(), AddRef()
  return S_OK;                       //and Release() from IUnknown
 }                                   //will be implemented here as
                                     //well as Fx1(), Fx2(), Fy1(),
 virtual ULONG   __stdcall AddRef()  //and Fy2() so that the class
 {                                   //can be instantiated.
  puts("Called AddRef()");
  return 0;
 }

 virtual ULONG   __stdcall Release()
 {
  puts("Called Release()");
  return 0;
 }

 virtual void    __stdcall Fx1(){printf("Called Fx1()\n");}  //implementations
 virtual void    __stdcall Fx2(){printf("Called Fx2()\n");}  //of inherited
 virtual void    __stdcall Fy1(){printf("Called Fy1()\n");}  //pure virtual
 virtual void    __stdcall Fy2(){printf("Called Fy2()\n");}  //functions.
};


int main(void)
{
 unsigned int i;
 int* dwPtr=0;
 CA* pCA=0;
 FN  pFn=0;

 pCA=new CA;
 printf("sizeof(CA)            = %u\t : An IX VTBL Ptr And A IY VTBL Ptr\n",sizeof(CA));
 printf("pCA                   = %u\t : Ptr To IX VTBL\n",(unsigned int)pCA);
 printf("*(int*)(&pCA)         = %u\t : Same Thing With Just More Involved Notation.\n",*(int*)(&pCA));
 printf("*(int*)*(int*)(&pCA)  = %u\t : Should Point To IX::QueryInterface()???\n",*(int*)*(int*)(&pCA));
 dwPtr=(int*)pCA;
 printf("dwPtr = %u\n",(unsigned int)dwPtr);
 for(i=0;i<2;i++)
 {
     pFn=(FN)*((int*)*(&dwPtr[i])+0);//QueryInt  @offset 0 in ith vtbl
     pFn();
     pFn=(FN)*((int*)*(&dwPtr[i])+1);//AddRef()  @offset 1 in ith vtbl
     pFn();
     pFn=(FN)*((int*)*(&dwPtr[i])+2);//Release() @offset 2 in ith vtbl
     pFn();
     pFn=(FN)*((int*)*(&dwPtr[i])+3);//Fx1,Fy1   @offset 3 in ith vtbl
     pFn();
     pFn=(FN)*((int*)*(&dwPtr[i])+4);//Fx2,Fy2   @offset 4 in ith vtbl
     pFn();
 }
 delete pCA;
 getchar();

 return 0;
}

And here is the output…

/*
sizeof(CA)            = 8       : An IX VTBL Ptr And A IY VTBL Ptr
pCA                   = 4144888 : Ptr To IX VTBL
*(int*)(&pCA)         = 4144888 : Same Thing With Just More Involved Notation.
*(int*)*(int*)(&pCA)  = 4224168 : Should Point To IX::QueryInterface()???
dwPtr = 4144888

Called QueryInterface()
Called AddRef()
Called Release()
Called Fx1()
Called Fx2()

Called QueryInterface()
Called AddRef()
Called Release()
Called Fy1()
Called Fy2()
*/

Now for the questions and problems. I have three different C++ compiler / development environments with which I can test this program. With Microsoft’s Visual C++ 6 the program seems to work flawlessly and produces the above output. With the Code::Blocks IDE version 8.02, which I installed a few weeks ago and which uses the GNU gcc compiler suite, the program produces exactly the same output as the MS VC++ build, but when I hit the [ENTER] key to get through the getchar() at program termination I get a GPF. I also have the Bloodshed Dev-C++ 4.9.9.2 suite, which uses a separate installation of the GNU gcc compiler suite, and there the program GPFs after the “Called Fy1()” line. So something is wrong, but I’m not sure what. My guess is that the problem lies somewhere in the bizarre casting needed to iterate through pointers that no longer point to CA objects but to int* in the vtables.

I’m aware that my typedef void (*FN)(void) doesn’t accurately reflect the QueryInterface(), AddRef(), and Release() function returns and signatures, but adding and using modified typedefs for these doesn’t change the results at all, so I don’t believe that is the problem.

My concern is that I’m failing to understand the memory layout, but if my understanding was poor and I’m laboring under major misconceptions I don’t think I would have gotten as far as I did. I really need to understand this because I’d like to be able to create COM components in other languages besides C++ and for that I need to completely understand this.

This business of the 1st members of a class being pointers to one or multiple VTABLES – is that a standardized C++ construction or just one implemented by the MS and GNU compilers I’ve used? It kind of seems pretty standard to me.

If anyone has any comments on or corrections to what I’ve done, I’d be glad to hear them. Again, I realize it would be easier to just write pCA->Fx1(), but that’s not the point. I’d particularly like to know why the program crashes at termination with some of the compilers.

If Asadullah Ansari is listening I’d especially like to thank him for his posts of March 10, 2008 where his coverage of some of these issues allowed me to make what progress I have managed with it here. Certainly, every post isn’t equally useful to everyone.

...If you haven't already, I strongly suggest that you read Inside The C++ Object Model (by Stanley B. Lippman).

Allow me to reformulate two well-known axioms:
1. COM is a LANGUAGE-NEUTRAL specification "of implementing objects that can be used in environments different from the one they were created in, even across machine boundaries" (Wikipedia).
2. The vtbl (or vtable) concept is NOT a part of the C++ language specification. It's the most popular mechanism for implementing dynamic dispatch (not only in C++ compilers), but it's not part of any C++ class object binary layout specification (no such animal in the C++ Universe).

There are tons of papers on these (totally different) topics (but I have never seen top secret manuals about COM or vtables).

It seems you are trying to explore a terra incognita where COM interface == C++ class object layout.
But this triad of COM, C++ and vtables lives in different namespaces...

Yes, MS used the same binary layout for Windows COM interfaces as MS C++ uses (for classes derived from abstract classes). It's a natural way to go (for MS). That's all. Is that some kind of esoteric knowledge?

See, for example:
http://www.codeproject.com/KB/COM/comintro.aspx
http://en.wikipedia.org/wiki/Component_Object_Model
http://en.wikipedia.org/wiki/Virtual_table

Thanks for the links ArtM. I've saved them and will definitely print out the one. As for the Stanley Lippman book "Inside The C++ Object Model", I've had that for some time. Unfortunately, I have not yet attained that exalted state of C++ competence where reading it is in any way beneficial to me. What would be especially beneficial to me in my present mentally depauperate state would be knowledge of what a process return value of 0xC0000005 means. For you see, that is what I am getting from the new Code::Blocks created program I am running. My Microsoft Visual C++ program builds and runs perfectly, but with any GNU compiler product I get various problems, and I'd like to know what sort of havoc I'm causing. Below is the output from the Code::Blocks compiled program. It runs and produces perfect output until it closes out with a crash...

sizeof(CA)            = 8        : An IX VTBL Ptr And A IY VTBL Ptr
pCA                   = 3146576  : Ptr To IX VTBL
*(int*)(&pCA)         = 3146576  : With Just More Convoluted Notation.
*(int*)*(int*)(&pCA)  = 4219068  : Ptr to Ptr For IX::QueryInterface()
pVTbl                 = 3146576

&pVTbl[i]       &VTbl[j]        pFn=VTbl[j]     pFn()
=========================================================================
3146576         4219068         4198656         Called QueryInterface()
3146576         4219072         4198688         Called AddRef()
3146576         4219076         4198720         Called Release()
3146576         4219080         4198752         Called Fx1()
3146576         4219084         4198768         Called Fx2()

3146580         4219048         4198816         Called QueryInterface()
3146580         4219052         4198832         Called AddRef()
3146580         4219056         4198848         Called Release()
3146580         4219060         4198784         Called Fy1()
3146580         4219064         4198800         Called Fy2()

Process returned -1073741819 (0xC0000005)   execution time : 11.636 s
Press any key to continue.

Here is the updated program which, as I've said, runs perfectly with VC++, but not with GNU. It's only 85 lines and heavily commented. I really would appreciate it if someone would take a quick look at it. I do want to learn this stuff, and would like to know if I'm doing anything very bad, moderately bad, or even slightly ill advised...

#include <stdio.h>                   //for printf(), puts(), etc.
#include <objbase.h>                 //for typedef struct interface
typedef void (*FN)(void);            //to ease use of function pointers

interface IX : IUnknown              //will cause inheritance of
{                                    //QueryInterface(),AddRef(),
 virtual void __stdcall Fx1()=0;     //and Release().  Note that when
 virtual void __stdcall Fx2()=0;     //class CA (just below) is
};                                   //instantiated, these two functions
                                     

interface IY : IUnknown              //of IX and two functions of IY
{                                    //will be implemented, and a memory
 virtual void __stdcall Fy1()=0;     //allocation will be made for a VTABLE
 virtual void __stdcall Fy2()=0;     //which will contain function pointers
};                                   //to these interface functions.


class CA : public IX, public IY      //will inherit pure abstract base
{                                    //classes IX and IY.
 public:
 virtual HRESULT __stdcall QueryInterface(const IID& iid, void** ppv)
 {
  puts("Called QueryInterface()");   //QueryInterface(), AddRef()
  return S_OK;                       //and Release() from IUnknown
 }                                   //will be implemented here as
                                     //well as Fx1(), Fx2(), Fy1(),
 virtual ULONG   __stdcall AddRef()  //and Fy2() so that the class
 {                                   //can be instantiated.
  puts("Called AddRef()");
  return 0;
 }

 virtual ULONG   __stdcall Release()
 {
  puts("Called Release()");
  return 0;
 }

 virtual void    __stdcall Fx1(){printf("Called Fx1()\n");} //implementations
 virtual void    __stdcall Fx2(){printf("Called Fx2()\n");} //of inherited
 virtual void    __stdcall Fy1(){printf("Called Fy1()\n");} //pure virtual
 virtual void    __stdcall Fy2(){printf("Called Fy2()\n");} //functions.
};


int main(void)
{
 DWORD* pVTbl=0; //will hold pointer to vtable (there are two - IX, IY vtables)
 DWORD* VTbl=0;  //address of vtable, i.e., pVTbl
 CA* pCA=0;      //pointer to CA instance
 FN  pFn=0;      //function pointer for calling interface functions through address
 DWORD i,j;      //for loop iterators

 pCA=new CA;     //create new instance of CA
 printf("sizeof(CA)               = %u\t\t : An IX VTBL Ptr And A IY VTBL Ptr\n",sizeof(CA));
 printf("pCA                      = %u\t : Ptr To IX VTBL\n",pCA);
 printf("*(DWORD*)(&pCA)          = %u\t : Same Thing With Just More Convoluted Notation.\n",*(int*)(&pCA));
 printf("*(DWORD*)*(DWORD*)(&pCA) = %u\t : Pointer to Pointer For IX::QueryInterface()\n",*(int*)*(int*)(&pCA));
 pVTbl=(DWORD*)pCA;
 printf("pVTbl                    = %u\n\n",(DWORD)pVTbl);
 printf("&pVTbl[i]\t&VTbl[j]\tpFn=VTbl[j]\tpFn()\n");
 printf("=========================================================================\n");
 for(i=0;i<2;i++)               //Two VTABLES!  Iterate Through Them.  The 1st VTABLE Pointer Occupies Bytes
 {                              //0 - 3 Offset From pCA, i.e., here 3146576 - 3146579.  The 2nd VTABLE Ptr
     VTbl=(DWORD*)pVTbl[i];     //is at Offset Bytes 4 - 7, i.e.,3146580 - 3146583.  The 'j' loop iterates
     for(j=0;j<5;j++)           //through each VTABLE calling the five functions contained in each interface.
     {                          //The VTABLEs contain pointers to the respective interface functions.
         printf
         (
          "%u\t\t%u\t\t%u\t\t", //The expression &pVTbl[i] will resolve to an address which contains a 
          &pVTbl[i],            //pointer to one of the two possible VTABLEs created when class CA was
          &VTbl[j],             //instantiated.  In this program run address 3146576 contains a pointer 
          VTbl[j]               //to the IX VTable, and when i=1 address 3146580 contains a pointer to
         );                     //the IY VTABLE.  Note that the variable VTbl is also an integer/DWORD
         pFn=(FN)VTbl[j];       //pointer, so, when we store the base address of either of the VTABLEs 
         pFn();                 //in VTbl, incrementing it through use of base pointer notation it will
     }                          //make available the next function pointer stored in the VTABLE.  The
     printf("\n");              //pointer array VTbl[] contains function pointers to the respective
 }                              //interface functions.  With C or C++, a typedef'ed function pointer
 delete pCA;                    //variable - here FN pFn - is usually created to take a function pointer
 getchar();                     //address to create a usable function pointer call.

 return 0;
}