0

I am trying to combine byte values into a long variable.

I have four bytes each in an unsigned char:

unsigned char a = BC;
unsigned char b = 1E;
unsigned char c = 04;
unsigned char d = 00;

I want to end up with the value 00041EBC in a long variable. That means combining the chars in the order d, c, b, a to form 00041EBC, which is a hex value. I am reading these bytes into the chars from an external file, and I need to rearrange them and put the result in a long to get the correct numerical value. Any ideas? Thanks.

7
Contributors
25
Replies
26
Views
8 Years
Discussion Span
Last Post by rpiper138

0

I have one and only one, but brilliant, idea: get your textbook and read about:
1. Hexadecimal int and char literals (0xBC and '\xBC', for example).
2. Shift operators << and >>
3. Bitwise OR/AND operators | and &
That's all, folks...
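
For anyone who wants to see those three ingredients side by side, here is a minimal sketch. The helper names `byte_at` and `byte_of` are made up for illustration, and both assume `index` stays in the range 0 through 3.

```c
/* Hypothetical helper: move one byte value to byte position `index`
   (index 0 is the least significant byte). */
unsigned long byte_at(unsigned char value, unsigned index)
{
    return (unsigned long)value << (8u * index);
}

/* Hypothetical helper: pick byte number `index` back out of n
   with a right shift and a bitwise AND mask. */
unsigned char byte_of(unsigned long n, unsigned index)
{
    return (unsigned char)((n >> (8u * index)) & 0xFFu);
}
```

With these, byte_at(0xBC, 0) | byte_at(0x1E, 1) | byte_at(0x04, 2) | byte_at(0x00, 3) should give 0x00041EBC, and byte_of(0x00041EBC, 2) should give 0x04.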

0

unsigned char values[4] = (char[]) 0xBC1E0400; //Read from file stream
long result;

for(int i=4;i<4;i++)
((char*)&result)[4-i] = values;

0

> unsigned char values[4] = (char[]) 0xBC1E0400; //Read from file stream
> long result;
>
> for(int i=4;i<4;i++)
> ((char*)&result)[4-i] = values;

Two problems with that code
1) the for loop will never execute because i is initialized to 4 and then tested to see whether it is less than 4, which it isn't.

2) result isn't an array so you can't use array subscripts on it.

0

Here is one way to do it.

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}

Nice piece of code. I never knew it could be (legally) done like this.
But you're assuming that a long has 4 bytes and this isn't guaranteed to be the case right?
ISO C++ guarantees that long is always at least as big as int, but an int could be 2 bytes on some systems, or am I mistaken?
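
On the size question, a rough sketch of how one might check what the compiler actually provides. The standard only sets minimums: int must cover at least 16 bits and long at least 32 bits, so long can also be wider than 4 bytes (8 bytes is common on 64-bit Unix-like systems, and there the cast would read past the end of a 4-byte array). The function names below are invented for this example.

```c
#include <limits.h>

/* The C and C++ standards require unsigned long to hold at least
   0..4294967295, so this check should never fire on a conforming compiler. */
#if ULONG_MAX < 0xFFFFFFFFUL
#error "unsigned long is narrower than 32 bits"
#endif

/* Storage size of unsigned long in bits on this implementation
   (commonly 32 or 64). */
unsigned long_storage_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Nonzero when *(unsigned long*)a would read exactly the four array bytes. */
int long_is_four_bytes(void)
{
    return sizeof(unsigned long) == 4;
}
```

Where an exactly 32-bit type is needed, uint32_t from <stdint.h> is an option on compilers that provide it.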

0

Well, one of the (almost ;)) portable ways to do that:

unsigned char a = '\xBC';
unsigned char b = '\x1E';
unsigned char c = '\x04';
unsigned char d = '\x00';
/* 00041EBC wanted */
unsigned long n;
n = a|(b<<8)|(c<<16)|(d<<24);
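
One reason an expression like this is only "almost" portable, as far as I can tell: each unsigned char is promoted to int before the shift, so c<<16 is undefined when int is 16 bits, and d<<24 overflows a 32-bit int when d is 0x80 or more. A sketch that sidesteps this by widening first (the function name `combine_le` is made up for this example):

```c
/* Combine four bytes into one value, with `a` as the least significant
   byte and `d` as the most significant. Each operand is widened to
   unsigned long before shifting, so the shifts never happen in int. */
unsigned long combine_le(unsigned char a, unsigned char b,
                         unsigned char c, unsigned char d)
{
    return  (unsigned long)a
         | ((unsigned long)b << 8)
         | ((unsigned long)c << 16)
         | ((unsigned long)d << 24);
}
```

combine_le(0xBC, 0x1E, 0x04, 0x00) should give 0x00041EBC, the value asked for in the first post.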
0

> Nice piece of code. I never knew it could be (legally) done like this.
> But you're assuming that a long has 4 bytes and this isn't guaranteed to be the case right?
> ISO C++ guarantees that long is always at least as big as int, but an int could be 2 bytes on some systems, or am I mistaken?

The whole problem revolves around the assumption that a long is 4 bytes long. Although the character array is 4 bytes, typecasting to long makes no such assumption: the typecast reads however many bytes the compiler uses for a long. If long is 16 bits, the cast will produce crap, because the array was written assuming a 32-bit long.

0

And what about the big/little-endian problem in Ancient Dragon's solution?
;)

That's only a problem if the data is transferred between computers running different operating systems. But both your example and mine have that potential problem.

0

Let's remember that we want to get the unsigned long value 0x00041EBC. With the cast, you must reverse the byte array's initial values on a big-endian CPU (that is, modify the example's source) to get this value. There is no need to change the code in my example ;)...
Moreover, the shift/or code works fine on 64-bit platforms too...
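
To make the difference concrete, here is a small sketch that puts the two approaches next to each other. It assumes uint32_t is available and that the machine is either plain little-endian or plain big-endian; all three function names are invented for illustration. memcpy stands in for the pointer cast because it gives the same memory-order result without the alignment risk.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* byte stored at the lowest address */
    return first == 1;
}

/* Reinterpret four bytes in memory order, like the pointer cast does. */
uint32_t load_native(const unsigned char bytes[4])
{
    uint32_t n;
    memcpy(&n, bytes, sizeof n);
    return n;
}

/* Build the value arithmetically: bytes[0] is the least significant byte. */
uint32_t load_shifted(const unsigned char bytes[4])
{
    return  (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

For the bytes {0xBC, 0x1E, 0x04, 0x00}, load_shifted should return 0x00041EBC on any machine, while load_native should return 0x00041EBC on a little-endian machine and 0xBC1E0400 on a big-endian one.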

0

> Two problems with that code
> 1) the for loop will never execute because i is initialized to 4 and then tested to see whether it is less than 4, which it isn't.
>
> 2) result isn't an array so you can't use array subscripts on it.

You are correct. Thanks. That's what I get for doing things at 2 am.
The loop can count in either direction, but it must cover 0 through 3.
I forgot the char* assigned to &result that I intended to index.

The following code that you added does not take into account that the information is read from a file and is not in the required byte order. I only used the array as a substitute for the data that would be pulled from the file (in the array order).

int main()
{
    unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
    unsigned long n = *(unsigned long*)a;
}
0

> Well, one of the (almost ;)) portable ways to do that:
>
> unsigned char a = '\xBC';
> unsigned char b = '\x1E';
> unsigned char c = '\x04';
> unsigned char d = '\x00';
> /* 00041EBC wanted */
> unsigned long n;
> n = a|(b<<8)|(c<<16)|(d<<24);

Unfortunately bit shifting is somewhat expensive from a performance standpoint.

0

> You are correct. Thanks. That's what I get for doing things at 2 am.
> The loop can count in either direction, but it must cover 0 through 3.
> I forgot the char* assigned to &result that I intended to index.
>
> The following code that you added does not take into account that the information is read from a file and is not in the required byte order. I only used the array as a substitute for the data that would be pulled from the file (in the array order).
>
> int main()
> {
>     unsigned char a[4] = {0XBC,0x1e,0x04,0x00};
>     unsigned long n = *(unsigned long*)a;
> }

It doesn't matter whether the data is read from a file or hard-coded in the program. If the data is in the wrong byte order (the big/little-endian problem), then it will have to be fixed before that typecast line.

2

> It doesn't matter whether the data is read from a file or hard-coded in the program. If the data is in the wrong byte order (the big/little-endian problem), then it will have to be fixed before that typecast line.

I believe that the original post was asking how to fix the data order. How to cast from one data type to another would be pretty basic.

Votes + Comments
As did I ...
Yes, I missed that. :)
0

> Unfortunately bit shifting is somewhat expensive from a performance standpoint.

Yes, I agree. But I am not sure that this expression is the bottleneck in the original poster's byte-picking code. In the real world we all use other tools for big/little-endian conversion ;)...

0

> Unfortunately bit shifting is somewhat expensive from a performance stand point.
How so?
All but the most simplistic processors have these nowadays.
http://en.wikipedia.org/wiki/Barrel_shifter

It's even free on some MIPS processors IIRC.

As with all code, you should first work on making it right before worrying about making it fast. The bit-shifting approach will ALWAYS give you a predictable answer, irrespective of your machine's endianness or of whatever alignment restrictions might blow up the dubious casting.

0

How will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endianness? You still have to shuffle the variables to match the endian setting.

0

> How will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endianness? You still have to shuffle the variables to match the endian setting.

What makes you think that?

0

> How will n = a|(b<<8)|(c<<16)|(d<<24); be portable regardless of endianness? You still have to shuffle the variables to match the endian setting.
Well, you read the spec.

If the spec says a is the LSB and d is the MSB, then that is what you write, regardless of the endianness of the machine which is doing the processing.

Likewise, if the spec says a is the MSB and d is the LSB, then you write
n = d|(c<<8)|(b<<16)|(a<<24);

Without that kind of information, simply saying "put these 4 bytes into a long word" is a pretty meaningless request.
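
A sketch of the two cases, written as a pair of readers over a 4-byte buffer. The names `load_le32` and `load_be32` are invented here; which one applies depends entirely on what the file format's spec says about byte order, not on the machine running the code.

```c
/* Spec says the first byte in the file is the LSB (little-endian layout). */
unsigned long load_le32(const unsigned char p[4])
{
    return  (unsigned long)p[0]
         | ((unsigned long)p[1] << 8)
         | ((unsigned long)p[2] << 16)
         | ((unsigned long)p[3] << 24);
}

/* Spec says the first byte in the file is the MSB (big-endian layout). */
unsigned long load_be32(const unsigned char p[4])
{
    return  (unsigned long)p[3]
         | ((unsigned long)p[2] << 8)
         | ((unsigned long)p[1] << 16)
         | ((unsigned long)p[0] << 24);
}
```

With the bytes from the first post in file order (BC 1E 04 00), load_le32 should give 0x00041EBC and load_be32 should give 0xBC1E0400, on either kind of machine.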

1

Let's remember the original post:

...I have four bytes each in an unsigned char:

unsigned char a = BC;
unsigned char b = 1E;
unsigned char c = 04;
unsigned char d = 00;

I want to have it so I can put the value 00041EBC into a long variable. This would mean I would need to combine char d,c,b,a in that order to get 00041EBC...

I think that is exactly an LSB/MSB specification. The author wants a long integer value in which byte a supplies the 8 least significant bits, b the next 8 bits, and so on. It's an absolutely little/big-endian-neutral specification.
That's why we must (and can ;)) use a portable, architecture-independent approach (the shift operator, for example, whose effects the language defines portably).

Votes + Comments
Great discussion!
0

It may be on *nix, but put the same code on MS-Windows without changing it and it will produce the wrong answer. That makes the statement "absolutely little/big-endian neutral specification" false, because the same code doesn't work right on both machines.

Or maybe I'm just being pigheaded :)

Votes + Comments
Great discussion!
0

A long doesn't have endianness; external storage of a long does. If the component byte(s) composing said long do not come from external storage, endianness does not enter the picture.
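
One way to picture that distinction: the value in the variable is just a number, and byte order only appears at the moment it is written out to, or read back from, a byte buffer (a file, a socket, and so on). A minimal sketch, assuming both sides agree on most-significant-byte-first for the external form; `put_u32_be` and `get_u32_be` are hypothetical names.

```c
/* Write the low 32 bits of n to out[0..3], most significant byte first. */
void put_u32_be(unsigned long n, unsigned char out[4])
{
    out[0] = (unsigned char)((n >> 24) & 0xFFu);
    out[1] = (unsigned char)((n >> 16) & 0xFFu);
    out[2] = (unsigned char)((n >> 8) & 0xFFu);
    out[3] = (unsigned char)(n & 0xFFu);
}

/* Read the same external form back into a value. */
unsigned long get_u32_be(const unsigned char in[4])
{
    return ((unsigned long)in[0] << 24)
         | ((unsigned long)in[1] << 16)
         | ((unsigned long)in[2] << 8)
         |  (unsigned long)in[3];
}
```

Neither function looks at how the machine stores a long internally, so the same source should produce the same external bytes everywhere. For sockets, the usual library route is htonl/ntohl, which do a comparable conversion for 32-bit values.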

0

>>A long doesn't have endianness;
I disagree -- try transferring via socket a long from *nix to MS-Windows without conversion. You will wind up with crap.

1

>>A long doesn't have endianness;
> I disagree -- try transferring via socket a long from *nix to MS-Windows without conversion. You will wind up with crap.

How do you not see that as 'external storage'?

Votes + Comments
Hehe :)
0

I believe that the initial question was asking about information stored in a file as a series of chars. This information was actually int16. If you want it portable then it would require conditional compilation via #ifdef.
