Combining Byte Values to Long

Question

ashkash 0 Newbie Poster

16 Years Ago

I am trying to combine byte values into a long variable.

I have four bytes each in an unsigned char:

unsigned char a = BC;
unsigned char b = 1E;
unsigned char c = 04;
unsigned char d = 00;

I want to have it so I can put the value 00041EBC into a long variable. This would mean I would need to combine char d,c,b,a in that order to get 00041EBC which is a hex value. I am reading these bytes into the chars from an external file and need to rearrange them and put the result in a long to get the correct numerical value. Any ideas? thanks.

c


rpiper138 46 Newbie Poster

16 Years Ago

It doesn't matter whether the data is read from a file or hard-coded in the program. If the data is in the wrong byte order (a big/little-endian problem), then it will have to be fixed before that typecast line.

I believe that the original post was asking how to fix the data order. How to cast from one data type to another would be pretty basic.

Ancient Dragon commented: Yes, I missed that. :) +36

Nick Evan commented: As did I ... +10

Salem 5,265 Posting Sage

16 Years Ago

> how will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endian? You still have to shuffle the variables to match the endian setting.
Well, you read the spec.

If the spec says a is the LSB and d is the MSB, then that is what you write, regardless of the endian of the machine which is doing the processing.

Likewise, if the spec says a is the MSB and d is the LSB, then you write
n = d|(c<<8)|(b<<16)|(a<<24);

Without that kind of information, simply saying "put these 4 bytes into a long word" is a pretty meaningless request.
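
The two cases described above could be sketched roughly as follows. The helper names load_le32 and load_be32 are made up for illustration and do not come from the thread; the sketch assumes 8-bit bytes and widens each byte to unsigned long before shifting.

```c
/* Hypothetical helper: the spec says p[0] is the LSB and p[3] is the MSB. */
unsigned long load_le32(const unsigned char p[4])
{
    return (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Hypothetical helper: the spec says p[0] is the MSB and p[3] is the LSB. */
unsigned long load_be32(const unsigned char p[4])
{
    return (unsigned long)p[3]
         | ((unsigned long)p[2] << 8)
         | ((unsigned long)p[1] << 16)
         | ((unsigned long)p[0] << 24);
}
```

With the bytes BC 1E 04 00 from the original question, load_le32 yields 0x00041EBC and load_be32 yields 0xBC1E0400, whatever the byte order of the machine running the code.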

ArkM 1,090 Postaholic

16 Years Ago

Let's remember the original post:

...I have four bytes each in an unsigned char:
unsigned char a = BC;
unsigned char b = 1E;
unsigned char c = 04;
unsigned char d = 00;
I want to have it so I can put the value 00041EBC into a long variable. This would mean I would need to combine char d,c,b,a in that order to get 00041EBC...

I think that is exactly an LSB/MSB specification. The author wants a long integer value in which byte a supplies the 8 least significant bits, b the next 8 bits, and so on. It's an absolutely little/big-endian neutral specification.
That's why we must (and can ;)) use a portable, architecture-independent approach (the shift operator, for example: its effects are defined portably by the language).

Nick Evan commented: Great discussion! +10

Dave Sinkula 2,398 long time no c

16 Years Ago

>>A long doesn't have endianness;
I disagree -- try transferring via socket a long from *nix to MS-Windows without conversion. You will wind up with crap.

How do you not see that as 'external storage'?

Nick Evan commented: Hehe :) +10

rpiper138 46 Newbie Poster

16 Years Ago

I believe that the initial question was asking about information stored in a file as a series of chars. This information was actually int16. If you want it portable then it would require conditional compilation via #ifdef.


ArkM 1,090 Postaholic · Answer 1 · 2008-09-24T02:56:47+00:00

I have one and only one, but brilliant, idea: get your textbook and read about:
1. Hexadecimal int and char literals (0xBC and '\xBC', for example).
2. Shift operators << and >>
3. Bitwise OR/AND operators | and &
That's all, folks...

rpiper138 46 Newbie Poster · Answer 2 · 2008-09-24T11:23:17+00:00

unsigned char values[4] = (char[]) 0xBC1E0400; //Read from file stream
long result;

for(int i=4;i<4;i++)
((char*)&result)[4-i] = values;

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 3 · 2008-09-24T19:41:19+00:00

Here is one way to do it.

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 4 · 2008-09-24T19:49:47+00:00

unsigned char values[4] = (char[]) 0xBC1E0400; //Read from file stream
long result;
for(int i=4;i<4;i++)
((char*)&result)[4-i] = values;

Two problems with that code
1) the for loop will never execute because i is initialized to 4 and then tested to see whether it is less than 4, which it isn't.

2) result isn't an array so you can't use array subscripts on it.

Nick Evan 4,005 Industrious Poster Team Colleague Featured Poster · Answer 5 · 2008-09-24T19:55:34+00:00

Here is one way to do it.

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

Nice piece of code. I never knew it could be (legally) done like this.
But you're assuming that a long has 4 bytes and this isn't guaranteed to be the case right?
ISO C++ guarantees that long is always at least as big as int, but an int could be 2 bytes on some systems, or am I mistaken?
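
On the size question: the C and C++ standards only require int to hold at least 16 bits and long at least 32 bits, so a long may be wider than 4 bytes (8 bytes is common on 64-bit Unix-like systems). A minimal sketch that sidesteps the question by using the exact-width uint32_t from <stdint.h> (C99); the function names are hypothetical and 8-bit bytes are assumed.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: storage bits occupied by unsigned long on this compiler. */
unsigned bits_in_unsigned_long(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Hypothetical helper: combine four bytes (a = least significant)
   into a value that is exactly 32 bits wide. */
uint32_t combine_u32(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return (uint32_t)a
         | ((uint32_t)b << 8)
         | ((uint32_t)c << 16)
         | ((uint32_t)d << 24);
}
```

bits_in_unsigned_long() reports what the compiler in use actually provides, while combine_u32 returns the same 32-bit result on any platform that offers uint32_t.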

ArkM 1,090 Postaholic · Answer 6 · 2008-09-24T20:08:53+00:00

Well, one of the (almost ;)) portable ways to do that:

unsigned char a = '\xBC';
unsigned char b = '\x1E';
unsigned char c = '\x04';
unsigned char d = '\x00';
/* 00041EBC wanted */
unsigned long n;
n = a|(b<<8)|(c<<16)|(d<<24);
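
One portability wrinkle in the snippet above: each unsigned char is promoted to int before the shift, so d<<24 is not representable in a 32-bit int once d is 0x80 or more, and c<<16 is undefined where int is only 16 bits wide. A minimal sketch of a variant that widens each byte first; combine_bytes is a made-up name, not something from the thread.

```c
/* Hypothetical helper: a supplies the least significant 8 bits, d the most
   significant. Each byte is converted to unsigned long BEFORE shifting, so
   the shifts never operate on a promoted (signed) int. */
unsigned long combine_bytes(unsigned char a, unsigned char b,
                            unsigned char c, unsigned char d)
{
    return (unsigned long)a
         | ((unsigned long)b << 8)
         | ((unsigned long)c << 16)
         | ((unsigned long)d << 24);
}
```

Called with 0xBC, 0x1E, 0x04, 0x00 it returns 0x00041EBC, and it stays well defined when the top byte has its high bit set.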

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 7 · 2008-09-24T20:24:26+00:00

Nice piece of code. I never knew it could be (legally) done like this.
But you're assuming that a long has 4 bytes and this isn't guaranteed to be the case right?
ISO C++ guarantees that long is always at least as big as int, but an int could be 2 bytes on some systems, or am I mistaken?

The whole problem revolves around the assumption that the long is 4 bytes long. Although the character array is 4 bytes, typecasting to long makes no such assumption, because the typecast will read however many bytes the compiler uses for a long. If it's 16-bit then the cast will result in crap because the array assumed it was a 32-bit long.

ArkM 1,090 Postaholic · Answer 8 · 2008-09-24T21:03:41+00:00

And how about big/little endian problem in the Ancient Dragon's solution?..
;)

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 9 · 2008-09-25T01:47:08+00:00

And how about big/little endian problem in the Ancient Dragon's solution?..
;)

That's only a problem if the data is transferred between computers running different operating systems. But both your example and mine have that potential problem.

ArkM 1,090 Postaholic · Answer 10 · 2008-09-25T03:05:23+00:00

Let's remember that we want to get the unsigned long value 0x00041EBC. On a big-endian CPU you must reverse the byte array's initial values (modify the example SOURCE) to get this value. No need to change the code in my example ;)...
Moreover, shift/or code works fine on 64-bit platforms too...

rpiper138 46 Newbie Poster · Answer 11 · 2008-09-25T04:54:18+00:00

Two problems with that code
1) the for loop will never execute because i is initialized to 4 and then tested to see whether it is less than 4, which it isn't.
2) result isn't an array so you can't use array subscripts on it.

You are correct. Thanks. That's what I get for doing things at 2 am.
The loop can count in either direction, but it must cover 0 through 3.
I forgot the char* assigned to &result that I intended to index.
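
A loop along those lines might look like the sketch below. It is not the poster's code: reverse_copy4 is a hypothetical name, and the destination is a plain 4-byte array rather than the storage of a long, which keeps the sketch independent of sizeof(long).

```c
#include <stddef.h>

/* Hypothetical helper: copy four bytes from src into dst in reverse order.
   The index i covers 0 through 3. */
void reverse_copy4(unsigned char dst[4], const unsigned char src[4])
{
    size_t i;
    for (i = 0; i < 4; i++)
        dst[3 - i] = src[i];
}
```

Note that reversing bytes in memory only fixes their order; what numeric value those four bytes represent once they are reinterpreted as an integer still depends on the byte order of the host.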

The following code that you added does not take into account that the information is read from a file and is not in the byte order required. I only used the array as a substitute for the data that would be pulled from the file (in the array order).

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

rpiper138 46 Newbie Poster · Answer 12 · 2008-09-25T04:58:59+00:00

Well, one of the (almost ;)) portable ways to do that:

unsigned char a = '\xBC';
unsigned char b = '\x1E';
unsigned char c = '\x04';
unsigned char d = '\x00';
/* 00041EBC wanted */
unsigned long n;
n = a|(b<<8)|(c<<16)|(d<<24);

Unfortunately bit shifting is somewhat expensive from a performance stand point.

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 13 · 2008-09-25T05:16:38+00:00

You are correct. Thanks. That's what I get for doing things at 2 am.
The loop can count in either direction, but it must cover 0 through 3.
I forgot the char* assigned to &result that I intended to index.
The following code that you added does not take into account that the information is read from a file and is not in the byte order required. I only used the array as a substitute for the data that would be pulled from the file (in the array order).
int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

It doesn't matter whether the data is read from a file or hard-coded in the program. If the data is in the wrong byte order (a big/little-endian problem), then it will have to be fixed before that typecast line.

ArkM 1,090 Postaholic · Answer 14 · 2008-09-25T11:05:02+00:00

Unfortunately bit shifting is somewhat expensive from a performance stand point.

Yes, I agree. But I am not sure that this expression is the bottleneck of the original poster's byte-picking stuff. In the real world we all use other tools for big/little-endian conversion ;)...

Salem 5,265 Posting Sage · Answer 15 · 2008-09-25T21:05:43+00:00

> Unfortunately bit shifting is somewhat expensive from a performance stand point.
How so?
All but the most simplistic processors have these nowadays.
http://en.wikipedia.org/wiki/Barrel_shifter

It's even free on some MIPS processors IIRC.

As with all code, you should first work on making it right, before worrying about making it fast. The bit-shifting approach will ALWAYS give you a predictable answer irrespective of your machine's endianness, or of whatever alignment restrictions might blow up the dubious casting.
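
On the alignment point: if the bytes really are wanted in the host's own order, copying them with memcpy is a common substitute for the pointer cast, since it does not dereference a possibly misaligned unsigned long pointer. This is a different technique from the shift approach and, unlike it, the result still depends on the machine. A minimal sketch, with load_host_u32 as a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read 4 bytes starting at any address into a uint32_t,
   using the HOST's byte order. memcpy works for unaligned addresses. */
uint32_t load_host_u32(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

For the bytes BC 1E 04 00 this gives 0x00041EBC on a little-endian host and 0xBC1E0400 on a big-endian one, which is why the shift-based version is the one to reach for when the file format fixes the byte order.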

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 16 · 2008-09-25T21:11:03+00:00

how will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endian ? You still have to shuffle the variables to match the endian setting.

Dave Sinkula 2,398 long time no c Team Colleague · Answer 17 · 2008-09-25T21:29:22+00:00

how will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endian ? You still have to shuffle the variables to match the endian setting.

What makes you think that?

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 18 · 2008-09-26T05:50:10+00:00

It may be on *nix, but put the same code on MS-Windows without changing the code and it will produce the wrong answer. That makes the statement "absolutely little/big-endian neutral specification" false, because the same code doesn't work right on both machines.

Or maybe I'm just being pigheaded :)

Dave Sinkula 2,398 long time no c Team Colleague · Answer 19 · 2008-09-26T07:47:01+00:00

A long doesn't have endianness; external storage of a long does. If the component byte(s) composing said long do not come from external storage, endianness does not enter the picture.
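
To illustrate the distinction between a value and its external storage, here is a minimal sketch in which the stored byte order is fixed by the code rather than by the machine. The names store_be32 and fetch_be32 are hypothetical, and most-significant-byte-first is just one possible convention.

```c
#include <stdint.h>

/* Hypothetical helper: write v to out[0..3], most significant byte first. */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Hypothetical helper: rebuild the value from bytes written by store_be32. */
uint32_t fetch_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24)
         | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)
         |  (uint32_t)in[3];
}
```

Bytes written by store_be32 on one machine are read back to the same value by fetch_be32 on any other, because neither function ever looks at how the integer is laid out in memory.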

Ancient Dragon 5,243 Achieved Level 70 Team Colleague Featured Poster · Answer 20 · 2008-09-26T08:57:04+00:00

>>A long doesn't have endianness;
I disagree -- try transferring via socket a long from *nix to MS-Windows without conversion. You will wind up with crap.
