The Difference Between Unspecified and Undefined Behaviour


What is the output of this program?

#include <iostream>
using namespace std;

int main()
{
    int array[] = {1,2,3};
    cout << array[3] << endl; // index 3 is one past the last element
}

Answer: No one knows!

What is the output of this program?

#include <iostream>
using namespace std;

void f(int i, int j){}

int foo()
{
    cout << "foo ";
    return 42;
}

int bar()
{
    cout << "bar ";
    return 42;
}

int main()
{
    f(foo(), bar());
}

Answer: No one knows!

There is a difference in the severity of the uncertainty, though. The first case results in undefined behaviour (because we are indexing outside of the array), whereas the second results in unspecified behaviour (because we don’t know the order in which the function arguments will be evaluated). What is the difference?

In the case of undefined behaviour, we are screwed. Anything can happen, from what you thought should happen, to the program sending threatening letters to your neighbour’s cat. Probably it will read the memory right after where the array is stored, interpret whatever garbage is there and print it, but nothing guarantees this.
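One way to stay out of undefined territory in the first example is a bounds-checked access. This is only a minimal sketch, assuming we are free to swap the raw array for std::array; the helper name element_or is made up for illustration. Unlike operator[], at() checks the index and throws std::out_of_range rather than reading past the end.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not from the original post): returns the element at
// `index`, or `fallback` when the index is outside the array.
int element_or(const std::array<int, 3>& values, std::size_t index, int fallback)
{
    try {
        return values.at(index);   // at() checks the index before reading
    } catch (const std::out_of_range&) {
        return fallback;           // well-defined result instead of garbage
    }
}
```

With an array holding {1,2,3}, element_or(array, 3, -1) returns the fallback -1, and it does so on every conforming implementation.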

In the case of unspecified behaviour, however, we are probably OK. The implementation is allowed to choose from a set of well-defined behaviours. In our case, there are two possibilities: calling foo() then bar(), which prints “foo bar ”, or bar() then foo(), which prints “bar foo ”. Note that if foo() and bar() have side effects that we rely on being executed in a specific order, this unspecified behaviour still means we have a bug in our code.

To summarize, never write code that results in undefined behaviour, and never write code that relies on unspecified behaviour.


10 thoughts on “The Difference Between Unspecified and Undefined Behaviour”

  1. And unspecified behavior can quickly lead to undefined behavior. For example, execution order is unspecified between sequence points, and changing the same variable more than once between sequence points is undefined. So if you change the arguments passed to f() to

    int i = 0;
    f(i++, i++);
    

    then the unspecified nature of the argument evaluation order leads to undefined behavior!

    I recommend reading this and the two following posts from the LLVM blog: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html

  2. It’s a good thing that you bring this up; I believe the two concepts are often misunderstood.

    However, your example of undefined behavior actually isn’t that. Reading the element one-past-the-end of an array is perfectly legal. And it has to be – otherwise the standard library algorithms wouldn’t be able to work uniformly across containers and arrays, since they rely on end() pointing to the element one-past-the-end of any container.

    1. Thanks for your comment Kristoffer. I can’t find it in the standard right now, but I am pretty sure reading past the end of an array is undefined behaviour.

      You are of course allowed to have a pointer to something past the end of the array, and you can use it for instance to compare against another pointer (as in your example). What constitutes undefined behaviour however, is to dereference the pointer and try to use the value of the element it is pointing to (as in my example).

    2. Let’s not forget that the standard doesn’t address arrays and standard library containers (your example with end()) identically.

      I am pretty confident that reading (dereferencing) a pointer outside the array bounds is undefined behavior, I just don’t have the standard reference to back up that point.

      Anyway – on the subject of *pointing* past the end of the array – people much smarter than me have discussed this topic, and even after reading everyone’s replies, **I’m still not sure if it’s undefined or not** to take the address of an array element past the end.

      http://stackoverflow.com/questions/988158

      Discuss! :-)

      1. Dan Says:
        > Discuss! :-)

        I’d rather not, actually! ;)

        But I guess we can all agree that:
        – My example does result in undefined behaviour.
        – If you are in a situation where the jury is still out on whether something is undefined or not, you’d better get out of there as quickly as possible! :)

      2. Anders is right: Using a pointer to one-past-the-end is valid, dereferencing it is not. See http://www.codeguru.com/cpp/cpp/print.php/c18603/C-Tutorial-The-Dos-and-Donts-of-Accessing-One-Element-Past-the-End-of-an-Array.htm:

        Accessing the address of one past the last element of an array is a valid operation under certain conditions. You can use that address only in pointer arithmetic expressions that access valid elements of the array, and in comparisons. You’re not allowed to dereference the result nor can you increment the pointer any further (say reaching the third element past the array’s end). Notice that STL containers follow this idiom. The end() member function returns an iterator pointing to one element past the last element of the container. You may use the iterator returned from end() only in comparisons and in expressions that access valid elements of the container:
