I am trying to combine byte values into a long variable.

I have four bytes each in an unsigned char:

unsigned char a = BC;
unsigned char b = 1E;
unsigned char c = 04;
unsigned char d = 00;

I want to put the value 00041EBC into a long variable. That means combining chars d, c, b, a in that order to get the hex value 00041EBC. I am reading these bytes into the chars from an external file and need to rearrange them and put the result in a long to get the correct numerical value. Any ideas? Thanks.

I have one and only one, but brilliant, idea: get your textbook and read about:
1. Hexadecimal int and char literals (0xBC and '\xBC', for example).
2. Shift operators << and >>
3. Bitwise OR/AND operators | and &
That's all, folks...
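
As a small illustration of those three items (hex literals, shifts, and the bitwise operators), here is a sketch that pulls one byte out of a wider value. The helper name `byte_at` is made up for this example, and 8-bit bytes are assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns byte number `index` of `value`,
// where index 0 is the least significant byte.
inline unsigned char byte_at(std::uint32_t value, unsigned index)
{
    assert(index < 4);  // a 32-bit value has only 4 bytes
    // Shift the wanted byte down to bits 0..7, then mask off the rest.
    return static_cast<unsigned char>((value >> (8 * index)) & 0xFFu);
}
```

With the value from the question, `byte_at(0x00041EBC, 0)` gives 0xBC and `byte_at(0x00041EBC, 2)` gives 0x04.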

unsigned char values[4] = (char[]) 0xBC1E0400; //Read from file stream
long result;

for(int i=4;i<4;i++)
((char*)&result)[4-i] = values[i];

Two problems with that code:
1) The for loop will never execute because i is initialized to 4 and then tested to see whether it is less than 4, which it isn't.

2) result isn't an array, so you can't use array subscripts on it.
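
For reference, a sketch of what that loop was probably aiming for: copy the four bytes into the object representation of the result in reverse order. The function name `reverse_into` is hypothetical, and `std::uint32_t` stands in for `long` so that the size is known to be four bytes. The numeric value that comes out still depends on the host's byte order.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: stores values[3], values[2], values[1], values[0]
// into the bytes of the result, lowest address first.
inline std::uint32_t reverse_into(const unsigned char (&values)[4])
{
    std::uint32_t result = 0;
    // Inspecting an object through unsigned char* is allowed by the language.
    unsigned char* out = reinterpret_cast<unsigned char*>(&result);
    for (std::size_t i = 0; i < 4; ++i)  // covers indices 0 through 3
        out[3 - i] = values[i];
    return result;
}
```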

Here is one way to do it.

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

Nice piece of code. I never knew it could be (legally) done like this.
But you're assuming that a long has 4 bytes and this isn't guaranteed to be the case right?
ISO C++ guarantees that long is always at least as big as int, but an int could be 2 bytes on some systems, or am I mistaken?

Well, one of the (almost ;)) portable ways to do that:

unsigned char a = '\xBC';
    unsigned char b = '\x1E';
    unsigned char c = '\x04';
    unsigned char d = '\x00';
    /* 00041EBC wanted */
    unsigned long n;
    n = a|(b<<8)|(c<<16)|(d<<24);
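
One caveat with the expression above: the `unsigned char` operands are promoted to `int` before the shift. Once `d` is 0x80 or larger, `d<<24` no longer fits in a 32-bit `int`, which is undefined in C and in older C++ standards and can otherwise come out sign-extended when widened to a 64-bit `unsigned long`. A hedged variant that widens each byte first might look like this; `combine_lsb_first` is a name invented for the example.

```cpp
#include <cstdint>

// Hypothetical helper: a is the least significant byte, d the most significant.
inline std::uint32_t combine_lsb_first(unsigned char a, unsigned char b,
                                       unsigned char c, unsigned char d)
{
    // Widen each byte to a 32-bit unsigned type before shifting, so the
    // shifts cannot overflow a narrower signed int.
    return  static_cast<std::uint32_t>(a)
         | (static_cast<std::uint32_t>(b) << 8)
         | (static_cast<std::uint32_t>(c) << 16)
         | (static_cast<std::uint32_t>(d) << 24);
}
```

For the bytes in the question, `combine_lsb_first(0xBC, 0x1E, 0x04, 0x00)` yields 0x00041EBC.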

> Nice piece of code. I never knew it could be (legally) done like this.
> But you're assuming that a long has 4 bytes and this isn't guaranteed to be the case right?
> ISO C++ guarantees that long is always at least as big as int, but an int could be 2 bytes on some systems, or am I mistaken?

The whole problem revolves around the assumption that a long is 4 bytes long. Although the character array is 4 bytes, the typecast to long makes no such assumption: it reads however many bytes the compiler uses for a long. If that is not 4 bytes, the cast will produce garbage, because the array was laid out for a 32-bit long.
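
To make the size assumption explicit, a sketch along these lines uses a fixed-width type and `std::memcpy` rather than a pointer cast, which also sidesteps alignment concerns. `load_native32` is a hypothetical name, and the value it returns still depends on the host's byte order.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterprets 4 bytes as a 32-bit value in the
// host's native byte order.
inline std::uint32_t load_native32(const unsigned char (&bytes)[4])
{
    static_assert(sizeof(std::uint32_t) == 4, "this sketch expects 8-bit bytes");
    std::uint32_t n;
    std::memcpy(&n, bytes, sizeof n);  // copies exactly 4 bytes
    return n;
}
```

On a little-endian host the bytes BC 1E 04 00 come out as 0x00041EBC; on a big-endian host the same bytes come out as 0xBC1E0400.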

And what about the big/little-endian problem in Ancient Dragon's solution?..
;)

That's only a problem if the data is transferred between computers running different operating systems. But both your example and mine have that potential problem.

Let's remember that we want to get the unsigned long value 0x00041EBC. On a big-endian CPU you must reverse the byte array's initial values (that is, modify the example SOURCE) to get this value. There is no need to change the code in my example ;)...
Moreover, the shift/or code works fine on 64-bit platforms too...

> Two problems with that code
> 1) the for loop will never execute because i is initialized to 4 then its testing i to see if its less than 4, which it isn't.
>
> 2) result isn't an array so you can't use array subscripts on it.

You are correct. Thanks. That's what I get for doing things at 2 am.
The loop can count in either direction, but it must cover 0 through 3.
I forgot the char* assigned to &result that I intended to index.

The following code that you added does not take into account that the information is read from a file and is not in the required byte order. I only used the array as a substitute for the data that would be pulled from the file (in the array order).

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

> Well, one of the (almost ;)) portable ways to do that:
>
>     unsigned char a = '\xBC';
>     unsigned char b = '\x1E';
>     unsigned char c = '\x04';
>     unsigned char d = '\x00';
>     /* 00041EBC wanted */
>     unsigned long n;
>     n = a|(b<<8)|(c<<16)|(d<<24);

Unfortunately bit shifting is somewhat expensive from a performance stand point.

It doesn't matter whether the data is read from a file or hard-coded in the program. If the data is in the wrong byte order (the big/little-endian problem), then it will have to be fixed before that typecast line.

I believe that the original post was asking how to fix the data order. How to cast from one data type to another would be pretty basic.

commented: Yes, I missed that. :) +36
commented: As did I ... +10

> Unfortunately bit shifting is somewhat expensive from a performance stand point.

Yes, I agree. But I am not sure that this expression is the bottleneck of the original poster's byte-picking code. In the real world, we all use other tools for big/little-endian conversion ;)...

> Unfortunately bit shifting is somewhat expensive from a performance stand point.
How so?
All but the most simplistic processors have these nowadays.
http://en.wikipedia.org/wiki/Barrel_shifter

It's even free on some MIPS processors IIRC.

As with all code, you should first work on making it right before worrying about making it fast. The bit-shifting approach will ALWAYS give you a predictable answer, irrespective of your machine's endianness or of whatever alignment restrictions might blow up the dubious casting.

How will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endianness? You still have to shuffle the variables to match the endian setting.

What makes you think that?

> how will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endian ? You still have > to shuffle the variables to match the endian setting.
Well you read the spec.

If the spec says a is the LSB and d is the MSB, then that is what you write, regardless of the endianness of the machine which is doing the processing.

Likewise, if the spec says a is the MSB and d is the LSB, then you write
n = d|(c<<8)|(b<<16)|(a<<24);

Without that kind of information, simply saying "put these 4 bytes into a long word" is a pretty meaningless request.
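
The two cases can be sketched as a pair of functions, one per byte order named in the spec. The names `read_lsb_first` and `read_msb_first` are invented here, and the buffer is assumed to hold at least four bytes. Neither function looks at the byte order of the machine running it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: p[0] is the least significant byte.
inline std::uint32_t read_lsb_first(const unsigned char* p)
{
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: p[0] is the most significant byte.
inline std::uint32_t read_msb_first(const unsigned char* p)
{
    assert(p != nullptr);
    return  static_cast<std::uint32_t>(p[3])
         | (static_cast<std::uint32_t>(p[2]) << 8)
         | (static_cast<std::uint32_t>(p[1]) << 16)
         | (static_cast<std::uint32_t>(p[0]) << 24);
}
```

For the file bytes BC 1E 04 00, `read_lsb_first` gives 0x00041EBC and `read_msb_first` gives 0xBC1E0400, on any host.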

Let's remember the original post:

...I have four bytes each in an unsigned char:

unsigned char a = BC;
unsigned char b = 1E;
unsigned char c = 04;
unsigned char d = 00;

I want to have it so I can put the value 00041EBC into a long variable. This would mean I would need to combine char d,c,b,a in that order to get 00041EBC...

I think that is exactly an LSB/MSB specification. The author wants a long integer value in which byte a supplies the 8 least significant bits, b the next 8 bits, and so on. It is an absolutely little/big-endian neutral specification.
That's why we must (and can ;)) use a portable, architecture-independent approach (with the shift operator, for example, whose effects the language defines portably).

commented: Great discussion! +10

It may be on *nix, but put the same code on MS-Windows without changing it and it will produce the wrong answer. That makes the statement "absolutely little/big-endian neutral specification" false, because the same code doesn't work right on both machines.

Or maybe I'm just being pigheaded :)

commented: Great discussion! +10

A long doesn't have endianness; external storage of a long does. If the component byte(s) composing said long do not come from external storage, endianness does not enter the picture.
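
One way to see the distinction is to write a value out in a fixed byte order and to look at the host's own memory layout separately. In this sketch, `store_msb_first` and `host_is_little_endian` are hypothetical helpers, and 8-bit bytes are assumed.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: writes value to out[0..3], most significant byte
// first, producing the same bytes on any host.
inline void store_msb_first(std::uint32_t value, unsigned char* out)
{
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Hypothetical helper: reports how this host happens to lay out a
// multi-byte integer in memory (true if the low byte comes first).
inline bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

The first function depends only on the value; only the second depends on the machine.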

>>A long doesn't have endianness;
I disagree -- try transferring via socket a long from *nix to MS-Windows without conversion. You will wind up with crap.

How do you not see that as 'external storage'?

commented: Hehe :) +10

I believe that the initial question was asking about information stored in a file as a series of chars. This information was actually int16. If you want it portable then it would require conditional compilation via #ifdef.
