A few questions about binary i/o..
I have a few ( slightly related ) questions about binary i/o in C++ that I can't seem to find full answers to anywhere..
- Is there any way to tell if an i/ostream passed to a function has been opened in binary or text mode?
- If not, are there any nuances to look out for when using put/read/write methods on an i/ostream that was opened in text mode? Would a process that assumes an i/ostream was binary, and worked 'correctly' in that case, be in any way incorrect if the stream it used was instead opened in text mode?
- How does endian-ness work? If two bytes A and B are written ( in that order ) into a file using put, then read on a machine with reversed endian-ness, do the bytes come out in a different order ( i.e. B, A ), or do they come out in the order A, B, but with the actual bits themselves being backwards? Or does the ( binary mode ) i/ostream normalize this so that it's not a problem? What if the bytes are written using ostream.write directly from an array reinterpret_casted to a char * ? ( and then read using istream.read )
MattEvans
Veteran Poster
1,386 posts since Jul 2006
Reputation Points: 522
Solved Threads: 64
Thanks for the info on storing things into the i/ostream directly, that could come in useful elsewhere :P, but, the reason I wanted to determine the open mode is to stop a method being inadvertently called with a stream in the wrong mode.. so, that won't work.. but I seem to get the same results in light testing when writing to a stream opened in 'the wrong mode' anyway.. hence my second question.
The endian-ness thing, I'm still not sure I get it... A more specific example of what I'm doing: writing 32-bit floats to and from binary files. I don't have access to a machine with a different endian-ness.. But, let's say I work on a PC, and I want my files to open on a Mac ( which is apparently reverse endian-ness to PC ).. [ I am happy to assume that floats are 32-bit on my targets.. and that they're represented the same... although, is that even a safe assumption on Windows, Mac, Linux? See, I really don't want to have to resort to a dodgy hand-rolled file representation for non-integers, and using a text representation for floating point arrays never really appealed to me. ]
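The two assumptions in brackets above ( floats are 32-bit, and use the same bit-level representation ) can at least be checked at compile time. The following is only a sketch: float_assumptions_hold is a hypothetical helper name, and is_iec559 only reports IEEE 754 conformance, it says nothing about byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Refuse to compile on a target where the assumptions do not hold.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "float is not 32 bits wide on this target");
static_assert(std::numeric_limits<float>::is_iec559,
              "float is not an IEEE 754 (IEC 559) type on this target");

// Hypothetical helper: the same checks as a value, for use in a runtime
// test such as assert(float_assumptions_hold()).
constexpr bool float_assumptions_hold() {
    return sizeof(float) == 4
        && std::numeric_limits<float>::is_iec559
        && std::numeric_limits<float>::digits == 24;  // 24-bit significand
}
```

If either static_assert fires, the rest of the format discussion needs revisiting for that platform.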
Anyway, assuming that the representation is the same.. if I do this:
#include <cassert>
#include <fstream>

int main( void )
{
    // Write the float's in-memory bytes ( host byte order ) in one call.
    std::ofstream out( "test.bin", std::ios::binary | std::ios::out | std::ios::trunc );
    const float f1 = 123.456f;
    out.write( reinterpret_cast< const char * >( &f1 ), sizeof( float ) );
    out.close( );
    // Read the same bytes straight back into a float.
    std::ifstream in( "test.bin", std::ios::binary | std::ios::in );
    float f2 = 0.0f;
    in.read( reinterpret_cast< char * >( &f2 ), sizeof( float ) );
    in.close( );
    assert( f1 == f2 );
}
Will it work ( i.e. will the read value equal the written value ) if the second half of that process is done on a reverse-endian machine? Or is it better ( or no change at all ) to do this?
#include <cassert>
#include <cstddef>
#include <fstream>

int main( void )
{
    // Write the float's in-memory bytes one at a time, in address order.
    std::ofstream out( "test.bin", std::ios::binary | std::ios::out | std::ios::trunc );
    const float f1 = 123.456f;
    const char * c1 = reinterpret_cast< const char * >( &f1 );
    for( std::size_t i = 0; i < sizeof( float ); ++i ) {
        out.put( c1[ i ] );
    }
    out.close( );
    // Read the bytes back one at a time, in the same order.
    std::ifstream in( "test.bin", std::ios::binary | std::ios::in );
    float f2 = 0.0f;
    char * c2 = reinterpret_cast< char * >( &f2 );
    for( std::size_t i = 0; i < sizeof( float ); ++i ) {
        c2[ i ] = static_cast< char >( in.get( ) );  // get( ) returns an int
    }
    in.close( );
    assert( f1 == f2 );
}
Or, is it necessary to reverse the order of the bits in each byte read if it is determined that the machine reading the file is 'backwards' ?
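Both snippets above copy the float's bytes in the order they sit in memory, so the file ends up in the writing machine's byte order either way. The bits inside each byte are never reversed. One possible alternative, sketched below, is to build the bytes with shifts so that the file order is fixed regardless of the host. The names write_float_le, read_float_le and round_trips are hypothetical, and the sketch assumes a 32-bit IEEE 754 float whose byte order in memory matches that of integers on the same machine.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: write a float as 4 bytes, least significant byte first.
void write_float_le(std::ostream& out, float value) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);  // copy the bit pattern, no conversion
    for (int shift = 0; shift < 32; shift += 8) {
        out.put(static_cast<char>((bits >> shift) & 0xFFu));
    }
}

// Hypothetical helper: read 4 bytes, least significant first, into a float.
// Returns false if the stream ends before 4 bytes were read.
bool read_float_le(std::istream& in, float& value) {
    std::uint32_t bits = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        const int byte = in.get();
        if (byte == std::istream::traits_type::eof()) {
            return false;
        }
        bits |= static_cast<std::uint32_t>(static_cast<unsigned char>(byte)) << shift;
    }
    std::memcpy(&value, &bits, sizeof value);
    return true;
}

// Usage example: round-trip a value through an in-memory binary stream.
bool round_trips(float value) {
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    write_float_le(buffer, value);
    float result = 0.0f;
    return read_float_le(buffer, result) && result == value;
}
```

Because the shifts operate on the value rather than on memory addresses, the same source produces the same file bytes on either kind of machine, at the cost of a few extra operations per float.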
MattEvans
I'm aware of potential issues involved when directly writing struct values to binary, so I already break the objects into primitives, but the smallest primitive I can break down to is a float, I'm happy to do this, it makes a certain amount of sense in a context which is basically, a load of arrays of float and uint32_t, with only a very tiny amount of considered structure ( a 16 byte header for each group of arrays, which may be hundreds/thousands of elements long ). The data is certainly not easily human-readable/writeable, so I wouldn't gain that usual advantage of a text format, only the portability. In any other situation though, I'd certainly go for a text format.
All I really need is a ( somewhat ) platform independent way of storing individual floats in binary format. I say somewhat platform independent, since PC and Mac are the only targets I'm focusing on.
But, based on what you've said, I guess that the second piece of code I posted would work ( reading and writing each byte of each float one at a time using get/put, and always reconstructing in the same order ), as long as floats are 32-bit, and are represented in the same way at bit-level on the target platforms...
MattEvans
- How does endian-ness work? If two bytes A and B are written ( in that order ) into a file using put, then read on a machine with reversed endian-ness, do the bytes come out in a different order ( i.e. B, A ), or do they come out in the order A, B, but with the actual bits themselves being backwards?
It's all about the interpretation of the bytes when loaded into memory that matters, not how they're stored. If the first machine writes A,B and the second machine reads A,B, the values will be interpreted differently because to the first machine A,B means A,B, but to the second machine A,B means B,A.
No, it's about how they are stored. The bytes will be backwards in the file when written by a big-endian system but read by a little-endian system.
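To make the A,B example concrete, here is a small sketch ( all function names are hypothetical ) that takes the same two bytes and combines them three ways: first byte most significant, first byte least significant, and whatever the host does when the bytes are copied straight into a 16-bit integer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// First byte is the most significant ( big-endian reading ).
std::uint16_t as_big_endian(unsigned char a, unsigned char b) {
    return static_cast<std::uint16_t>((a << 8) | b);
}

// First byte is the least significant ( little-endian reading ).
std::uint16_t as_little_endian(unsigned char a, unsigned char b) {
    return static_cast<std::uint16_t>((b << 8) | a);
}

// What this machine gets when the two bytes are copied into memory as-is.
std::uint16_t as_host(unsigned char a, unsigned char b) {
    const unsigned char raw[2] = { a, b };
    std::uint16_t value = 0;
    std::memcpy(&value, raw, sizeof value);
    return value;
}

// Probe the host: on a little-endian machine the low byte is stored first.
bool host_is_little_endian() {
    const std::uint16_t probe = 0x0102;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 0x02;
}
```

The bytes themselves arrive in the order they were written; only the multi-byte value built from them differs between machines.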
Matt, see this
WaltP
Posting Sage w/ dash of thyme
10,506 posts since May 2006
Reputation Points: 3,348
Solved Threads: 944
That's another way of saying the same thing. :)
Not really. You said "It's all about the interpretation of the bytes when loaded into memory that matters, not how they're stored." I'm saying you must know how they are stored or you can't properly read the data.
I think I know what you were trying to say, but it wasn't clear.
WaltP
Thank you both. I read through that Wikipedia article ( again =p ) and also this page from IBM http://www.ibm.com/developerworks/aix/library/au-endianc/index.html?ca=drs- , and I get it now. All I need to decide now is whether I want to force one endian-ness in the file format, or do something at the beginning of the file to indicate endian-ness. I can live with the 32-bit float assumption.. at least until I find an un-ignorable platform where there isn't a 32-bit floating point type...
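For the second option ( something at the beginning of the file to indicate endian-ness ), one possible shape is sketched below. The marker value 0x01020304 and the names kOrderMark, byte_swap32, write_order_mark, read_order_mark and read_float are all hypothetical choices for illustration, not part of any existing format. The writer stores the marker in its own byte order, and the reader swaps every following 32-bit value only if the marker comes out reversed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Hypothetical marker: reads back as 0x04030201 on an opposite-endian machine.
constexpr std::uint32_t kOrderMark = 0x01020304u;

enum class ByteOrder { Same, Swapped, Unknown };

// Reverse the four bytes of a 32-bit value.
std::uint32_t byte_swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) | ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Writer side: store the marker in the host's own byte order.
void write_order_mark(std::ostream& out) {
    out.write(reinterpret_cast<const char*>(&kOrderMark), sizeof kOrderMark);
}

// Reader side: decide whether the file matches the host's byte order.
ByteOrder read_order_mark(std::istream& in) {
    std::uint32_t mark = 0;
    if (!in.read(reinterpret_cast<char*>(&mark), sizeof mark)) {
        return ByteOrder::Unknown;
    }
    if (mark == kOrderMark) return ByteOrder::Same;
    if (byte_swap32(mark) == kOrderMark) return ByteOrder::Swapped;
    return ByteOrder::Unknown;
}

// Read one float, swapping its bytes if the file was written the other way round.
// The caller is expected to check the stream state afterwards.
float read_float(std::istream& in, ByteOrder order) {
    std::uint32_t bits = 0;
    in.read(reinterpret_cast<char*>(&bits), sizeof bits);
    if (order == ByteOrder::Swapped) {
        bits = byte_swap32(bits);
    }
    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

The trade-off against forcing one byte order in the format is that writing stays a plain write of host memory, while every reader has to carry the swap path even if it is rarely exercised.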
MattEvans