I spent a few hours grappling with data corruption until I found something interesting.
My code was getting crunched after a certain point because of an sprintf overwriting the first byte of the array right after it.

After some research, I found something interesting and irritating.

According to MSDN,

Security Note There is no way to limit the number of characters written, which means that code using sprintf is susceptible to buffer overruns. Consider using the related function _snprintf, which specifies a maximum number of characters to be written to buffer, or use _scprintf to determine how large a buffer is required. Also, ensure that format is not a user-defined string.

Oh, glee and rapture. This problem was ignored in the command line version because the same area was overwritten each time. But in this new world of GUI, it's a problem, particularly because the array (here, array2) is a time difference that increases in length, and doesn't have a problem before it passes 100 seconds.

My code approximates to this:

struct foobar
{
   array1[25];
   array2[16];
};


int main (void)
{
 struct foobar foo;

	foo.array2[0] = '0'; foo.array2[1] = '0'; foo.array2[2] = ':';
	foo.array2[3] = '0'; foo.array2[4] = '0'; foo.array2[5] = ':';
	sprintf(&(foo.array2[6]), "%9f", diff/(1.0e6));	//This overwrites by one byte.
       //Function that writes to array 1 goes here (also using sprintfs, but with enough of a buffer that the danger is minimized, or something.)
}

If the time in array2 gets too large, it overflows the buffer and corrupts the variable that follows it.

I have managed to get this under control by placing the sprintf and its array as shown above. This way, the function writing array1 can't be overwritten by the already-completed array2 sprintf, and array2 has been placed at the end of the struct so that it won't be able to overflow into another variable.

But this is a cop out.

How can I fix this without relying on placement of code? I don't know when the time will roll over to 100 because of the data's nature. I also don't know what the maximum size will be on the time difference (from the start), so I can't use snprintf or scprintf. Of course, those don't seem to be in the libraries available to me, either...

Thanks in advance.

Recommended Answers

All 8 Replies

I assume you are writing a C program, not C++. If this is a c++ program then you should not be using C-style buffers at all -- use std::string and std::stringstream.

One way to avoid buffer overflow is to make the arrays bigger -- there is nothing at all wrong with having buffers that are a little bit bigger than needed. So what if you waste a few bytes (unless you are allocating thousands of those structures) -- that is a lot better than too small buffers which cause buffer overflows.

you don't have to set each element of the array individually like you show. Just do it like this

sprintf(foo.array2, "00:00:%9f", diff/(1.0e6));

>This problem was ignored in the command line version
No, you were just unlucky enough that it didn't fail with as sirens, bells, and confetti.

>But this is a cop out.
It's still wrong because structure members are allowed to have padding between them and the padding is off-limits. If you overflow into the padding, you still invoke undefined behavior and the problem still exists.

>How can I fix this without relying on placement of code?
You could always make the array big enough for the longest possible values. Or you could calculate the length of the values, then dynamically size a simulated array to fit that length, plus your formatting.

I wasn't happy with my "Solution", either, which is why I asked, and called the code a 'cop out'.

To quote the boss: "Memory's Cheap", so even though that means storing on the order of 64kB more memory in arrays, the system still has a lot. I'll expand the arrays, and switch back to a configuration where the failures become quite evident. I figured that's how I'd have to do it, and hopefully the memory required won't become too hideous...

Thanks for the help.

>To quote the boss: "Memory's Cheap"
Your boss is clueless. Yes, memory may be cheap, and these days there is a lot of it. However, that doesn't change the fact that if you use too much, you'll page into virtual memory and your program will grind to a screeching halt. Also, the more memory you use, the greater your chances of encountering excessive cache misses and noticeably slowing your program.

So yea, memory is cheap, but there's also no such thing as a free lunch.

>To quote the boss: "Memory's Cheap"
Your boss is clueless. Yes, memory may be cheap, and these days there is a lot of it. However, that doesn't change the fact that if you use too much, you'll page into virtual memory and your program will grind to a screeching halt. Also, the more memory you use, the greater your chances of encountering excessive cache misses and noticeably slowing your program.

So yea, memory is cheap, but there's also no such thing as a free lunch.

Well, you have to be practicle about it. One wouldn't allocate 10 mg buffer just to hold 16 bytes! "memory is cheap" means you don't have to be too stingy about allocating a few more bytes of memory than needed. Such was the case in the OPs erxample -- one byte too few can cause days or weeks of debugging (been there, done that!). Just allocate one or two more bytes than really needed will save slitting your wrists trying to find the bug.

>"memory is cheap" means you don't have to be too stingy about
>allocating a few more bytes of memory than needed.
True, but you still have to be practical about it. If you take the "memory is cheap" approach throughout the system, a few more bytes in every routine can add up. It's better to figure out exactly how much your upper limit will be and if you feel the need, allocate that much memory for all cases. At least that way you will understand your memory needs better than if you just add "a few more bytes" until the problem appears to go away.

>Just allocate one or two more bytes than really needed will save
>slitting your wrists trying to find the bug.
And if that isn't enough for all cases? You'll have created an even more subtle bug, and what's worse than slitting your wrists trying to find a bug is not even knowing that the bug is there. I'm sorry, but it seems too much like you're advocating ignorance about the program's memory needs for me to agree with you. I can count the number of times that I've blindly allocated too much without knowing exactly what the most I needed was on one hand. It's no accident that I produce very few bugs in my code.

I agree that allocating a larger buffer is not going to solve all the world's problems, but it will solve the problem of the OP by using the "memory is cheap" approach. sprintf() and other similar C functions need as large of buffer space as you can give it. Giving it too small buffer will cause large headaches.

For that reason, programmers who use c++ compilers that support STL (and not all of c++ compilers do) should not even use those C functions, but use std::stringstream and std::string instead. Those class will avoid all those memory problems, unless you want to suck-up every byte you can out of your program. That is a general rule, and there are many cases where c-style character arrays cannot be avoided.

Believe me; I'm not happy with this approach, either. I've thrown in an assert (verify, actually) to make the program fail in a tracable way if it starts approaching a dangerous timestring.
I've also heavily commented the problem, and I'll include it in the documentation.

EDIT: Looks like verify isn't going to work as I wanted it to. I'll have try something else (Modal dialog box, for example) to warn the user in the release build.


I don't like how long I have to wait for this application to load correctly. I've got a sort of 'scrolling buffer' memory allocation system in mind... I think I've talked about this before. It's just a way to display a limited section of the file at a time, reusing the same memory and storing byte offsets to move backwards. It would drastically reduce the memory requirements and apparent access time, but it would be more difficult to code.

However, since even moving the display window forces a potentially monsterous redraw, it may be necessary to switch to my memory allocation system anyway.

Bleh. I don't like "Memory's cheap". It's a pretty dangerous line of thought.

Be a part of the DaniWeb community

We're a friendly, industry-focused community of developers, IT pros, digital marketers, and technology enthusiasts meeting, networking, learning, and sharing knowledge.