Why is accessing data in an uninitialised array undefined behaviour?

I ask because I was doing some benchmarking on vector<int> vs. int* vs. int**. My results were: http://stackoverflow.com/questions/20863478/class-using-stdvector-allocation-slower-than-pointer-allocation-by-a-lot

I'm now being told that if I use a raw array and read an index without initialising it first, that is undefined behaviour, and that I must initialise it. The thing is, if I initialise it, it takes 43 seconds compared to 5 seconds without. I'm asking here because I always get better answers here than on Stack Overflow, where they are arguing over whether I'm allowed to do it or not.

Any ideas what I could do to keep the speed of uninitialising without breaking any rules?

What do you mean by "access" in this context?

Why is accessing data in an uninitialised array undefined behaviour?

It's not really clear whether this is undefined behavior or not. If you read uninitialized data, you will read a garbage value, i.e., whatever arbitrary combination of bits happens to be in that memory location at the time of the allocation. That's what's going to happen on virtually every platform (unless you compile in debug mode, where "uninitialized" data is often filled with a recognizable pattern to make such errors easy to spot).

Undefined behavior is a technical term from the C++ standard. What the standard says here is that if you dynamically allocate an array, the elements get "default-initialized", which is the standard's term for "if class type, the default constructor is called; if trivial type (e.g., int), it is left uninitialized". The standard then says that uninitialized primitive values have an "indeterminate value", which (to me, at least) implies that reading such a value is well-defined behavior: it's just that the behavior is to obtain a "garbage" value.
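To make the default-initialized case concrete, here is a minimal sketch of the two ways to spell an array new-expression for int. The helper names make_zeroed and make_uninitialized are hypothetical, made up for this illustration only. The trailing () requests value-initialization (zeros); without it, the elements are default-initialized and hold indeterminate values until you write to them.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: allocate n ints and zero them (value-initialization).
std::unique_ptr<int[]> make_zeroed(std::size_t n) {
    assert(n > 0);
    return std::unique_ptr<int[]>(new int[n]());  // note the trailing ()
}

// Hypothetical helper: allocate n ints and leave them default-initialized,
// i.e. with indeterminate values. Write to an element before reading it.
std::unique_ptr<int[]> make_uninitialized(std::size_t n) {
    assert(n > 0);
    return std::unique_ptr<int[]>(new int[n]);    // no (): nothing is written
}
```

The zeroing in the first form is presumably where the extra time in a benchmark like the OP's would go, though that is an assumption about the benchmark, not something measured here.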

Anyhow, these are just SO-style language-lawyers' discussions.

Any ideas what I could do to keep the speed of uninitialising without breaking any rules?

The rule is that you cannot meaningfully (UB or not) read the value of an uninitialized variable (part of an array or not). But, obviously, you can write to it. You could pre-allocate an array before you come to the point of having values to initialize it with. But obviously, you can't have an uninitialized array and then read its values.
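As a rough sketch of "allocate first, write later, read only after writing", assuming the array is used as a matrix: the Matrix struct and fill_identity function below are hypothetical names invented for this example, not anything from the OP's code. The storage is one flat allocation that is left uninitialized, and every element is written exactly once before anything reads it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical matrix type: one flat, default-initialized allocation.
struct Matrix {
    std::size_t rows, cols;
    std::unique_ptr<int[]> data;

    Matrix(std::size_t r, std::size_t c)
        : rows(r), cols(c), data(new int[r * c]) {}  // elements left indeterminate

    int& at(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i * cols + j];  // row-major indexing
    }
};

// Writes every element exactly once; after this, reads are well-defined.
void fill_identity(Matrix& m) {
    for (std::size_t i = 0; i < m.rows; ++i)
        for (std::size_t j = 0; j < m.cols; ++j)
            m.at(i, j) = (i == j) ? 1 : 0;
}
```

The point of the design is that no time is spent on a blanket initialization pass; the only writes are the ones that store real values.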

When using std::vector, the equivalent of pre-allocating the array while leaving it uninitialized is the reserve() function, which does exactly that. When you have values to give to the elements of the array, you can just use a series of push_back operations to put in the valid values. Otherwise, std::vector always initializes all the new elements when you use resize() or the number-of-elements constructor (by copy-construction in C++03, by value-initialization since C++11, which means zero for int).
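A minimal sketch of the reserve() plus push_back pattern described above. The function name make_squares and the choice of squares as the values are arbitrary assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: build a vector of n squares without first
// filling the storage with placeholder values.
std::vector<int> make_squares(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);  // allocates capacity for n elements; size() stays 0
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i * i));  // each element is constructed once
    }
    assert(v.size() == n);
    return v;
}
```

Because the capacity is reserved up front, none of the push_back calls needs to reallocate, and at no point does the vector contain an element that has not been given a value.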

Why is accessing data in an uninitialised array undefined behaviour?

It really depends on where the array is defined. If it's a local array then it's undefined behavior just as accessing an uninitialized variable is undefined behavior. The contents of the array are indeterminate, which is the "official" reason for the undefined behavior.

The practical reason is that accessing random bits can trigger something like a trap representation and totally blow up your application simply by looking at a value.

Any ideas what I could do to keep the speed of uninitialising without breaking any rules?

I'd need more details on what you're using the array for. Restructuring the array so you only access it sequentially rather than randomly would be a good start, but that's not always appropriate for the problem.

I was just using the array as a matrix. I don't mind indeterminate values at all if I access an array with no initial value. I just don't want my program crashing or me doing something unsafe because of some undefined behaviour that I don't understand.

I don't know why accessing an uninitialised array would be undefined. I understand indeterminate but not undefined. I don't understand traps by just looking at it either. I'm not sure why I would be able to write to it but not read from it at all. If I can write to it, why can't I read garbage values from it willingly?

I asked someone why it was undefined and they said to me that the OS only gives memory to a variable when it has a value otherwise nothing happens :S

Thank you for your replies.

I don't mind indeterminate values at all if I access an array with no initial value. I just don't want my program crashing because of some undefined behaviour that I don't understand.

Those two sentences are contradictory.

I don't understand traps by just looking at it either.

Imagine calling abort, but without the happy result for a user. ;) I wouldn't wish troubleshooting a trap representation on even the worst of my enemies.

I asked someone why it was undefined and they said to me that the OS only gives memory to a variable when it has a value otherwise nothing happens :S

Wow, that was a stupid answer. Whoever told you that is quite confused. You get memory, you just can't necessarily predict the contents of that memory.

The contents of the array are indeterminate, which is the "official" reason for the undefined behavior.

There is obviously a gray zone between indeterminate values and undefined behavior. If it weren't for trap representations (which I doubt still exist on any hardware at all), there would really be no reason why reading a variable of indeterminate value should cause undefined behavior. It might be the case that, very strictly speaking, this is UB, but certainly not in practice.

From what I have read on trap representations (which I just learned about in this thread), it seems that (1) in C++, trap representations only apply to floating-point numbers, and (2) any IEEE-compliant processor must not have trap representations on floating-point numbers. Even C99 integer trap representations seem so rare that the committee members themselves looked for example architectures and could not find any. I think the standards (C/C++) could be re-worded to say that reading an indeterminate value (such as an uninitialized variable) is well-defined behavior, and it would not change anything at all for any implementation or architecture that anyone still uses (it seems the last processor to have a trap representation on floating-point numbers was an old DSP chip from the 80s).

Is there any other reason why reading an indeterminate value would be undefined behavior?

I don't mind indeterminate values at all if I access an array with no initial value. I just don't want my program crashing because of some undefined behaviour that I don't understand.

Those two sentences are contradictory.

I don't see a contradiction at all. There is a huge difference between UB and a garbage value. I can certainly understand the OP's concerns about wanting to know if it is one way or the other. UB means, ridiculously-speaking, that your computer could spit out a ham sandwich, for all we know. But a garbage value is just a garbage value. I could easily think of examples where garbage values would not be a big problem (e.g., doing block copy of a partially initialized memory buffer over a network or file, not caring for the excess "junk" data, as long as reading it is not UB).

the OS only gives memory to a variable when it has a value

Whoever told you that is quite confused.

That person almost seems to be describing a pure functional programming language, where values and variables are one and the same, i.e., a variable comes into existence when a value comes into existence, and then it never changes. Anyway, that's not relevant here.
