Alignment

Thread Solved

n.aggel
#1
Jul 16th, 2008
Hi guys, I have one question regarding alignment. Assume that you have the following struct:

    struct align1
    {
        double a;
        char b;
        int c;
        short d;
    };

Also assume:
sizeof(double): 8
sizeof(int): 4
sizeof(char): 1
sizeof(short): 2

I would expect:
sizeof(align1): 8 + (char padded to ->) 4 + 4 + 2 = 18 bytes
but I get:
sizeof(align1): 8 + (char padded to ->) 4 + 4 + (short padded to ->) 4 = 20 bytes

So what I don't understand is why the compiler pads the structure at the end. In theory, the compiler could fill the 2 wasted bytes of the struct with 2 chars, so why waste them on padding?

To make the question clearer, suppose we had:

    struct align1 var1;
    char a;
    char b;

Laid out together, these would need only 20 bytes, but with the compiler's strategy they need 24 bytes. What I want to learn is why gcc makes this choice.

The program to test all this follows:

    #include <stdio.h>

    struct align1
    {
        double a;
        char b;
        int c;
        short d;
    };

    struct align2
    {
        double a;
        char b;
        short d;
        int c;
    };

    struct align3
    {
        double a;
        int c;
        char b;
        short d;
    };

    int main(void)
    {
        /* sizeof yields a size_t, so print it with %zu rather than %d */
        printf("sizeof(double): %zu\nsizeof(int): %zu\nsizeof(char): %zu\nsizeof(short): %zu\n",
               sizeof(double), sizeof(int), sizeof(char), sizeof(short));

        printf("minimum space required: %zu bytes\n",
               sizeof(double) + sizeof(int) + sizeof(char) + sizeof(short));

        printf("alignment1 costs: %zu bytes\n", sizeof(struct align1));
        printf("alignment2 costs: %zu bytes\n", sizeof(struct align2));
        printf("alignment3 costs: %zu bytes\n", sizeof(struct align3));

        return 0;
    }


I have read the following articles (but I still haven't found the answer to what I am asking):
[1] http://en.wikipedia.org/wiki/Packed
[2] http://www-128.ibm.com/developerwork...ary/pa-dalign/
[3] http://msdn.microsoft.com/en-us/library/ms253949.aspx

dwks
#2
Jul 16th, 2008
So your question is, why does structure alignment apply to the last member of a structure?

I think the answer is the same as for why structure alignment exists in the first place. Tail padding makes the size of a structure a multiple of the structure's own alignment requirement, which is that of its most strictly aligned member. Your sizeof result of 20 shows that this is 4 bytes with your compiler, the word size of a 32-bit machine. Alignment matters because it's a lot easier for a computer to read data at addresses that sit on such boundaries.

Consider an array of structures. Padding individual members of the structure isn't much use if you don't pad the last member; the second element in the array won't start at a word boundary.
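
To see the array argument concretely, here is a small C++ sketch using the struct from the first post. The helper names `every_element_aligned` and `report_array_layout` are made up for illustration, and `alignof` assumes a C++11 compiler. It checks that each array element starts at a multiple of the struct's alignment, which only works because sizeof already includes the tail padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

struct align1
{
    double a;
    char b;
    int c;
    short d;
};

// The size must be a whole multiple of the alignment; otherwise the second
// array element could not sit directly behind the first and still be aligned.
static_assert(sizeof(align1) % alignof(align1) == 0,
              "sizeof is a multiple of alignof");

// Hypothetical helper: true if every element of the array starts at an
// address that is a multiple of alignof(align1).
bool every_element_aligned(const align1* arr, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)
    {
        const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(&arr[i]);
        if (addr % alignof(align1) != 0)
            return false;
    }
    return true;
}

// Prints the numbers involved; the exact values depend on the platform ABI.
void report_array_layout()
{
    align1 arr[3] = {};
    std::printf("sizeof(align1) = %zu, alignof(align1) = %zu\n",
                sizeof(align1), alignof(align1));
    std::printf("sizeof(arr) = %zu\n", sizeof(arr));
    assert(every_element_aligned(arr, 3));
}
```

Calling `report_array_layout()` from `main` prints the values for your platform. With the 32-bit gcc setup from the first post I would expect 20 and 4, but that is an ABI detail and not something the language promises.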

[edit] In this particular case, the compiler could have made the structure smaller, I suppose. However, the sizeof a structure is always supposed to be the same thing, so your structure would then be virtually useless for arrays. [/edit]

Ancient Dragon
#3
Jul 16th, 2008
You can change the compiler's padding behavior by using some options or pragmas. Microsoft compilers can eliminate padding altogether with

    #pragma pack(1) // pack members on 1-byte boundaries, i.e. don't pad this structure
    typedef struct
    {
        // blabla
    } packed_t; // packed_t is just a placeholder name
    #pragma pack() // revert to default padding
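
As a rough illustration of what packing does to the struct from the first post, here is a C++ sketch. It uses the `#pragma pack(push, 1)` / `#pragma pack(pop)` form, which MSVC, gcc and clang all accept; the names `padded_record`, `packed_record` and `report_packing` are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Default layout: the compiler inserts whatever padding the ABI calls for.
struct padded_record
{
    double a;
    char b;
    int c;
    short d;
};

#pragma pack(push, 1)   // save the current packing, then pack on 1-byte boundaries
struct packed_record
{
    double a;
    char b;
    int c;
    short d;
};
#pragma pack(pop)       // restore the packing that was in effect before

// With 1-byte packing the size is simply the sum of the member sizes.
static_assert(sizeof(packed_record) ==
                  sizeof(double) + sizeof(char) + sizeof(int) + sizeof(short),
              "packed struct has no padding");

// Prints both sizes; the padded size depends on the platform ABI.
void report_packing()
{
    std::printf("padded: %zu bytes, packed: %zu bytes\n",
                sizeof(padded_record), sizeof(packed_record));
    assert(sizeof(packed_record) <= sizeof(padded_record));
}
```

The push/pop form is used here so that the packing setting cannot leak into headers included afterwards; that is a design preference, not a requirement.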

Deleting padding is useful when you want to save data in binary form to a file or xmit it across socket and you don't know what is on the other end. It will also help reduce memory requirements in some applications which use large arrays of those structures.

vijayan121
#4
Jul 17th, 2008
    struct align1
    {
        double a;
        char b;
        int c;
        short d;
    };

> why the compiler pads the structure in the end?

hint:
1. does the struct align1 have an alignment requirement?
2. if we create an array align1 array[5] ; would the alignment be right for every element in the array?

ArkM
#5
Jul 17th, 2008
An addition to dwks's post:
if it were not for structure tail padding, one of the most valuable C and C++ identities

    Type array[SIZE];
    sizeof(array)/sizeof(Type) == SIZE

would be incorrect for some Types (such as the structure from the original post).

n.aggel
#6
Jul 17th, 2008
Thank you all for your answers.

> So your question is, why does structure alignment apply to the last member of a structure?
Yes.

> hint:
> 1. does the struct align1 have an alignment requirement?
> 2. if we create an array align1 array[5] ; would the alignment be right for every element in the array?
I can't understand why the structure is padded at the end. OK, if we want an array of align1 elements, *then* we have to pad the end. But if we don't want an array, why does the compiler choose to pad the end?

Take for example the second program, where I create a composite struct containing the first struct: why "pay" 24 bytes when we can fit the whole composite struct in 20 and at the same time respect the boundary alignment?

    pad2end: CFLAGS += -Wpacked -Wpadded
    pad2end: pad2end.o

    #include <stdio.h>

    struct align1
    {
        double a;
        char b;
        int c;
        short d;
    };

    struct alignComposite
    {
        struct align1 one;
        char two;
        char three;
    };

    int main(void)
    {
        /* sizeof yields a size_t, so print it with %zu rather than %d */
        printf("sizeof(align1) : %zu bytes\n", sizeof(struct align1));
        printf("sizeof(alignComposite) : %zu bytes\n", sizeof(struct alignComposite));
        return 0;
    }

    /*
    so align1 should occupy 8+4+4+2 = 18 bytes without the padding at the end,
    and alignComposite should occupy 18 + 2 = 20 bytes

    but

    $ ./pad2end
    sizeof(align1) : 20 bytes
    sizeof(alignComposite) : 24 bytes
    */

The way I see it, going by vijayan's hints and the fact that structures are treated like scalar variables and behave the same way (please correct me if I am wrong): the structs are padded so that no padding occurs when we create an array of them, like with a scalar variable, because if the compiler padded the struct when it created the array, it wouldn't be the same behaviour we have when we create an array of integers.

Is this notion correct? Can anyone put it more formally?

thanks again for your time,
nicolas

PS: what's the difference between the -Wpacked and -Wpadded warnings?

Salem
#7
Jul 17th, 2008
> But if don't want an array why the compiler chooses to pad the end?
How is the compiler supposed to know you don't want an array of them?

> why "pay" 24bytes when we can fit the whole composite struct in 20 and at the
> same time respect the boundary alignment?
The compiler isn't allowed to rearrange structure members to achieve the best fit.
If you want to arrange things so that you waste the least space, then arrange as groups of double, float, long, int, short, char.

double, char, double, char will take more space than
double, double, char, char
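
A quick C++ sketch of that comparison, in case anyone wants to try it. The struct and function names (`interleaved`, `grouped`, `report_member_order`) are invented for the example, and the byte counts in the comments assume an ABI where double is 8 bytes and 8-byte aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// double, char, double, char: each char is followed by padding so that the
// next double (or the next array element) is aligned again.
// Typically 8 + 1 + 7 + 8 + 1 + 7 = 32 bytes when double is 8-byte aligned.
struct interleaved
{
    double d1;
    char c1;
    double d2;
    char c2;
};

// double, double, char, char: the two chars share one padded tail.
// Typically 8 + 8 + 1 + 1 + 6 = 24 bytes when double is 8-byte aligned.
struct grouped
{
    double d1;
    double d2;
    char c1;
    char c2;
};

// Prints both sizes; the exact values depend on the platform ABI.
void report_member_order()
{
    std::printf("interleaved: %zu bytes, grouped: %zu bytes\n",
                sizeof(interleaved), sizeof(grouped));
    assert(sizeof(grouped) <= sizeof(interleaved));
}
```

Both structs hold exactly the same data; only the declaration order differs, and the compiler has to honour that order.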

> Deleting padding is useful when you want to save data in binary form to a file
> or xmit it across socket and you don't know what is on the other end.
But packing in itself is not enough to ensure success. Packing can never solve the endian problem, for example. Also, packing increases the complexity (and decreases the performance) of the code accessing the structure. On some machines, reading a packed int may take 4 byte reads and some shifting, compared to a single aligned 32-bit read if the compiler were free to do its own thing.

Also, does pack() actually guarantee a minimum, or just a "best effort"?

Plus there's the whole "lack of portability" in specifying such things to the compiler to begin with.

vijayan121
#8
Jul 17th, 2008
> The structs are padded so that no padding occurs when we create an array of them,
> like with a scalar variable because if the compiler padded the struct when it created the
> array it wouldn't be the same behaviour we have when we created an array of integers.
> Is this notion correct?

not merely 'it wouldn't be the same behaviour we have when we created an array of integers'. array subscript and pointer arithmetic would not work otherwise.

> Can anyone put it more formally?

the formal exposition is in the legalese in the IS (ISO/IEC 14882).
but this arises out of a few simple requirements:
1. the layout and sizeof an object of a particular compile-time type should be known at compile-time.
2. these have to be independent of the storage duration of the object.
3. these also have to be independent of whether the object is an element of an array, a member of a class, a base class sub-object etc.
4. object types can have alignment requirements.
5. an object should be allocated at an address that meets the alignment requirements of its object type.
6. array subscript and pointer arithmetic should work correctly.
7. the memory model of C++ should be compatible with that of C89.

here is a (poor) attempt at a formal explanation:
    struct A
    {
        char a ;
        int b ;
        char c ;
    };

the layout of A and sizeof(A) are known at compile-time and would be the same for all objects of type A, independent of the storage duration of the object, and independent of whether the object is an element of an array, a member of a class, a base class sub-object etc.

in an array, no padding can be added between the elements; this is required for pointer arithmetic (and array subscript) to work correctly.
> ... An object of array type contains a contiguously allocated non-empty set of N sub-objects of type T. ... IS 8.3.4/1
because of this, sizeof(A) has to be an integral multiple of alignof(A), and alignof(A) cannot be less than alignof(any member of A).

for example, in an implementation where the sizeof(int) == 4 and the alignof(int) == 4, the alignof(A) == 4 and the layout of A could be
char a, 3 bytes padding, int b, char c, 3 bytes padding .

but not char a, char c, 2 bytes padding, int b.
Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. ... IS 9.2/12
(required for C compatibility)

and not 3 bytes padding, char a, int b, char c, 3 bytes padding
A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
[Note: There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment. ] IS 9.2/17
(also required for C compatibility)
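
The three layout rules above can be checked with a short C++ sketch. It assumes a C++11 compiler, where `alignof` is a keyword; compilers of 2008 offered it only as an extension, such as gcc's `__alignof__`. The function name `report_layout_of_A` is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct A
{
    char a;
    int b;
    char c;
};

// No padding in front of the first member.
static_assert(offsetof(A, a) == 0, "first member sits at offset 0");

// Members keep their declaration order, so 'c' cannot be moved next to 'a'.
static_assert(offsetof(A, a) < offsetof(A, b) && offsetof(A, b) < offsetof(A, c),
              "later members have higher addresses");

// The struct is at least as strictly aligned as its int member, and its
// size is a whole multiple of that alignment (this is the tail padding).
static_assert(alignof(A) >= alignof(int), "alignment covers every member");
static_assert(sizeof(A) % alignof(A) == 0, "size is a multiple of alignment");

// Prints the layout; the exact values depend on the platform ABI.
void report_layout_of_A()
{
    std::printf("alignof(A) = %zu, sizeof(A) = %zu\n", alignof(A), sizeof(A));
    std::printf("offsets: a = %zu, b = %zu, c = %zu\n",
                offsetof(A, a), offsetof(A, b), offsetof(A, c));
    assert(offsetof(A, b) % alignof(int) == 0);
}
```

Where sizeof(int) and alignof(int) are both 4, I would expect this to report offsets 0, 4 and 8 and a size of 12, matching the "char a, 3 bytes padding, int b, char c, 3 bytes padding" layout, but treat that as an expectation for common ABIs and not a guarantee.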

    struct B : A
    {
        char d ;
    };

    struct C
    {
        A a ;
        char d ;
    };

unless A is empty (and empty base class or empty member optimization applies), the padding at the end of A cannot be used to accommodate the member d.
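
A small C++ sketch to observe this, with the caveat that `report_tail_padding` is an invented name and that I only assert on the member case (struct C). For the base-class case (struct B) the sketch just prints the size, since I am relying on the compiler's ABI there and would not hard-code the result.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

struct A
{
    char a;
    int b;
    char c;
};

struct B : A
{
    char d;
};

struct C
{
    A a;
    char d;
};

// The member 'a' occupies all sizeof(A) bytes, tail padding included,
// so 'd' has to start at or after the end of that object.
static_assert(offsetof(C, d) >= sizeof(A), "d is placed after the whole of A");

// Prints the sizes; the exact values depend on the platform ABI.
void report_tail_padding()
{
    std::printf("sizeof(A) = %zu, sizeof(B) = %zu, sizeof(C) = %zu\n",
                sizeof(A), sizeof(B), sizeof(C));
    std::printf("offsetof(C, d) = %zu\n", offsetof(C, d));
    assert(sizeof(C) > sizeof(A));
}
```

If the padding at the end of A were reused, sizeof(C) could equal sizeof(A); the assertion shows that it does not.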
> The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). ...
> [Footnote: The intent is that the memory model of C++ is compatible with that of ISO/IEC 9899 Programming Language C. --- end footnote] IS 3.9/4
Lippman explains it very well in 'Inside the C++ Object Model'.

n.aggel
#9
Jul 21st, 2008
Thanks for your help... everything makes much more sense now. The 7 requirements that vijayan posted made everything clear. There are some things that I don't quite get in the last post, like:
the alignof operator(?)

or

> but not char a, char c, 2 bytes padding, int b.
but I will read part of Inside the C++ Object Model (Lippman's book), and then if I still have these questions I will ask again!