Disempower Every Variable


In which I argue you should reduce the circle of influence, and the ability to change, of every variable.

The more a variable can do, the harder it is to reason about. If you want to change a single line of code involving the variable, you need to understand all its other uses. To make your code more readable and maintainable, you should disempower all your variables as much as possible.

Here are two things you can do to minimize the power of a variable:

1: Reduce its circle of influence (minimize the scope)
I once had to make a bugfix in a 400-line function containing dozens of for-loops. They all reused a single counter variable:

{
  int i;
  (...)
  for (i = 0; i < n; ++i) {
  }
  (...)
  for (i = 0; i < n; ++i) {
  }
  //350 lines later...
  for (i = 0; i < n; ++i) {
  }
}

When looking at a single for-loop, how am I to know that the value of i is not used after the specific loop I was working on? Someone might be doing something like

for (i = 0; i < n; ++i) {
}
some_array[i] = 23;

or

for (i = 0; i < n; ++i) {
}
for (; i < m; ++i) {
}

The solution here is of course to use a variable local to each for-loop (unless of course its value actually is needed outside of the loop):

for (int i = 0; i < n; ++i) {
}
for (int i = 0; i < n; ++i) {
}

Now I can be sure that if I change i in one for-loop, it won’t affect the rest of the function.
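And if the final value of the counter actually is needed after the loop, the wider scope is exactly what you want. A minimal sketch of such a case (use(), some_array and key are just hypothetical stand-ins):

int i; //Deliberately declared outside the loop, its final value is needed below
for (i = 0; i < n; ++i) {
  if (some_array[i] == key)
    break;
}
use(i); //Index of the first match, or n if nothing matched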

2: Take away its ability to change (make it const)

(I have blogged about const a few times before. It is almost always a good idea to make everything that doesn’t need to change const.)

Making a local variable const helps the reader to reason about the variable, since he will instantly know that its value will never change:

void foo() {
  const string key = getCurrentKey();
  (...) //Later...
  doSomethingWith(key);
  (...) //Even later...
  collection.getItem(key).process();
}

Here the reader knows that we are always working with the same key throughout foo().
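As a bonus, taking away the ability to change means any accidental attempt to modify key is caught at compile time. A minimal sketch:

const string key = getCurrentKey();
key = "some other key"; //Compile error: key is const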

In summary: Reduce the circle of influence (by reducing the scope) and take away the ability to change (by using const).

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Undefined Behaviour — Worse Than its Reputation?


Last week I wrote about The Difference Between Unspecified and Undefined Behaviour. This week I’d like to expand a bit more on the severity of undefined behaviour. If you have a lot of time, however, instead go read A Guide to Undefined Behavior in C and C++ by John Regehr of the University of Utah, and then What Every C Programmer Should Know About Undefined Behavior by Chris Lattner of the LLVM project, as they cover this material in much more depth (and a lot more words!) than I do here.

To expand on the example from last week, what is the output of this program?

#include <iostream>
using namespace std;

int main()
{
    int array[] = {1,2,3};
    cout << array[3] << endl;
    cout << "Goodbye, cruel world!" << endl;
}

A good guess would be a random integer on one line, then “Goodbye, cruel world!” on another line. A better guess would be that anything can happen on the first line, but then “Goodbye, cruel world!” for sure is printed. The answer, however, is that we can’t even know that, since “if any step in a program’s execution has undefined behavior, then the entire execution is without meaning” [Regehr p.1].

This fact has two implications that I want to emphasize:

1: An optimizing compiler can move the undefined operation to a different place than it is given in the source code
[Regehr p.3] gives a good example of this:

int a;

void bar(); //Defined elsewhere

void foo (unsigned y, unsigned z)
{
  bar();
  a = y%z; //Possible divide by zero
}

What happens if we call foo(1,0)? You would think bar() gets called, and then the program crashes. The compiler is however allowed to reorder the two lines in foo(), and [Regehr p.3] indeed shows that Clang does exactly this.
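A sketch of what the optimizer is effectively allowed to turn foo() into (a hypothetical transformation for illustration, not Clang's actual output):

void foo (unsigned y, unsigned z)
{
  a = y%z; //The possibly trapping division now happens before bar() is called
  bar();
}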

What are the implications? If you are investigating a crash in your program and never see the results of bar(), you might falsely conclude that the bug in the source code must be before bar() is called, or at its very beginning. To find the real bug in this case you would have to turn off optimization, or step through the program in a debugger.

2: Seemingly unrelated code can be optimized away near a possible undefined behaviour
[Lattner p.1] presents a good example:

void contains_null_check(int *P) {
  int dead = *P;
  if (P == 0)
    return;
  *P = 4;
}

What happens if P is NULL? Maybe some garbage gets stored in int dead? Maybe dereferencing P crashes the program? At least we can be sure that we will never reach the last line, *P = 4, because of the check if (P == 0). Or can we?

An optimizing compiler applies its optimizations in series, not in one omniscient operation. Imagine two optimizations acting on this code, “Redundant Null Check Elimination” and “Dead Code Elimination” (in that order).

During Redundant Null Check Elimination, the compiler figures out that if P == NULL, then int dead = *P; results in undefined behaviour, and the entire execution is undefined. The compiler can basically do whatever it wants. If P != NULL however, there is no need for the if-check. So it safely optimizes it away:

void contains_null_check(int *P) {
  int dead = *P;
  //if (P == 0)
    //return;
  *P = 4;
}

During Dead Code Elimination, the compiler figures out that dead is never used, and optimizes that line away as well. This invalidates the assumption made by Redundant Null Check Elimination, but the compiler has no way of knowing this, and we end up with this:

void contains_null_check(int *P) {
  *P = 4;
}

When we wrote this piece of code, we were sure (or so we thought) that *P = 4 would never be reached when P == NULL, but the compiler (correctly) optimized away the guard we had meticulously put in place.
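The robust version of this function, for the record, dereferences P only after the null check. A minimal sketch of one way to write it:

void contains_null_check(int *P) {
  if (P == 0)
    return;
  int dead = *P; //Only dereference P after the check has passed
  *P = 4;
}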

Concluding notes
If you thought undefined behaviour only affected the operation in which it appears, I hope I have convinced you otherwise. And if you found the topic interesting, I really recommend reading the two articles I mentioned in the beginning (A Guide to Undefined Behavior in C and C++ and What Every C Programmer Should Know About Undefined Behavior). And the moral of the story is of course to avoid undefined behaviour like the plague.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.