Please Make Member Functions Const Whenever Possible


When you write a member function that doesn’t modify the object it operates on, please make sure to make it const.

class Whisky {
public:
  Smell smellIt() const;
  Tase tasteIt(int ml);
};

Smelling the whisky doesn’t modify it, so I made smellIt() const. Tasting does however modify it (there is less left in the glass), so tasteIt() is not const.

Why does this matter? If someone makes a const object of the class, they might still want to call some methods on it. For instance, a good practice for a method is to take objects by reference to const. And then you will only be able to call const methods:

void describeWhisky(const Whisky& whisky) {
  std::cout << whisky.smellIt();
}

This would not be possible if smellIt() was not const. The same goes for using const_iterators and a lot of other situations, so please remember to add const whenever you can, even though it doesn’t make a differece to you then and there.

Edit: I posted a follow up, on how you can change some parts of an object, even if it is const.

Static and Dynamic Polymorphism – Siblings That Don’t Play Well Together


Static and dynamic polymorphism might sound related, but do not play well together, a fact I had to deal with l this week.

This is the situation: We have a set of worker classes, that work on a set of data classes. The worker classes can do different types of work, but the type of work is always known at compile time, so they were created using static polymorphism, aka templates.

template &lt;typenmame WorkImpl&gt;
class Worker {
...
};

WorkImpl decides the details of the work, but the algorithm is defined in Worker, and is always the same.

Those are the workers. We then have a pretty big tree of data classes. What kind of objects we need to deal with is not known until runtime, so we are using inheritance and virtual methods for these.

  A
 / \
B   C
|
D 

etc.

A has a virtual method fill() that gets passed a WorkImpl object, and fills it with data. fill() needs to be implemented in all the subclasses of A, to fill it with their special data. Bascially, what we want is to say

class A { 
  template  &lt;typename WorkImpl&gt;
  virtual fill(WorkImpl&amp; w) {...} //overridden in B, C, D etc.
};

Can you spot the problem?

Hint: When will the templated fill() methods be instantiated?

There is no way for the compiler to know that classes A-D will ever have their fill()-methods called, thus the fill()-methods will never be instantiated!

One solution to this problem is to make one version of fill() per type of WorkImpl, in each data class A-D. The body of a fill() method is independent of the type of WorkImpl, so this solution would lead to a lot of unnecessary duplication.

The solution we ended up with was creating wrapper methods that take a specific WorkImpl, and just calls fill() with that argument:

class A {
  virtual fillWrapper(OneWorkImpl&amp; w) { fill(w) };
  virtual fillWrapper(AnotherWorkImpl&amp; w) { fill(w) };
  template &lt;typename WorkImpl&gt;
  virtual fill(WorkImpl&amp; w) {...}
}

In this way we minimize code duplication, and still make sure the fill() methods are always instantiated.

Question: Can we get away with defining the wrappers in A only? And if not, when will it fail?

Answer: We can not. If we only define the wrappers in A, fill() will never be instantiated in B-D. Everything compiles and links, but at runtime, it is the A version that will be called for all the classes, leading to a wrong result.

Finally, I should mention that another option would have been to refactor the worker classes to use dynamic polymorphism, which would basically solve everything, and allow us to get rid of the wrapper functions. But that would have been a bigger job, which we didn’t have time for.

Book Review: Effective C++ by Scott Meyers


A while ago, I read Effective C++ by Scott Meyers, and I would like to recommend it.

You can read tutorials, play with the language, maybe read The C++ Programming Language by Stroustrup himself, and be able to write fully functioning C++ programs that pass all the tests, and seem to work. But still, they can have more or less subtle problems, which might manifest as random crashes, resource leaks, scalability problems, or problems with reusability and maintenance. In general, correct C++, but still wrong.

Instead of discovering all of this yourself, you can invest some time in reading best practices books. This of course goes for all languages and processes, but I think it is especially important for C++, being such a complex language, and being one of the few remaining big languages lacking a garbage collector.

Effective C++ gives you, as its full title promises, 55 Specific Ways to Improve Your Programs and Designs. These are further grouped into nine chapters that deal with related topics. Each item ranges from one page for the simple ones, to ten pages for the more involved. They all include, and tend to start with, example code, followed by a discussion of what is wrong with it, and conclude with a better way. I find this to be a good pedagogical approach, and it is indeed one I strive to follow on this blog.

Meyers start out with the simple stuff, such as recommending the use of const whenever possible. This is is simple tip by the sound of it, but he actually uses a full nine pages to cover the topic, const being such a versatile keyword. He moves on to other well known tips like preventing exceptions from leaving destructors (which I also covered briefly while elaborating on function try blocks) and keeping data members private, but also touches on more complex topics, like object oriented design in C++, the proper use of overloaded operators and keywords, and generic programming.

For a full list of chapters and items, please use the magic of the fabulous internet.

The book has four and a half stars on Amazon, which I definitely think it deserves. You do however need some experience with C++ to appreciate it, and indeed to understand some of it. I would not recommend it as follow up to a short university course about C++, or after reading an introductory text to the language, but if you have been programming C++ actively for some time, it is a rewarding read. Even if you have written C++ for many years, I am sure there is new stuff to learn in there.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Don’t Put All Includes in the .hxx


Ok, one more thing about headers, then I promise to talk about something else.

Just because a header file is called a header file, you shouldn’t include all your headers there. Often, a source file, where the definition of you functions and/or classes are, need quite a lot of headers to do its work. It might for instance need <algorithm> to do some sorting or searching, a fact your declarations don’t need to worry about:

processing.hxx

#include "MyType.hxx"

void process(MyType obj); 

processing.cc

#include "processing.hxx"
#include <algorithm>

void process(MyType obj) {
    std::find(...);
    std::sort(...);
}

You almost always need to include some headers in the .hxx, so why not put them all there? While compiling your .cc file, it doesn’t really matter, since all those files are included anyway. But when someone else includes your header, they suddenly depend on everything your implementation depends on. This also adds to compilation time for your users. So do your includes as listed above, don’t move

Use Your Head Before Using Using in Headers


Speaking of headers, did you ever think about what you are really doing when you put using namespace my_util in a header file? Sure, it might be annoying to namespace-qualify all the types in your declarations. But using namespace my_util forces all the names in my_util into the global namespace of everyone who includes your header file.

As usual, here is a small example:

my_iface.hxx:

bool check(my_types::Foo foo, my_types::Bar bar);
bool listen(my_types::Foo foo, my_types::Bar bar);
bool digest(my_types::Foo foo, my_types::Bar bar);

The repeated namespace qualifying gets annoying, so you would rather do:
my_iface.hxx:

using namespace my_types;
bool check(Foo foo, Bar bar);
bool listen(Foo foo, Bar bar);
bool digest(Foo foo, Bar bar);

This is however a bad idea. “using namespace my_types” (a using directive) exports all the types in my_types into the global namespace, which I think everyone agrees is a bad idea. You could argue that a using declaration is better, as “using my_types::Foo” only places Foo into the global namespace. I would however disagree, as a smaller sin is still a sin.

And you’re not getting away that easily, this argument is also valid for std! Yes, you need to type my_func(std::vector v) in header files.

A final trick you might try is to only use using namespace inside of your other namespaces, like so:
my_iface.hxx:

namespace my_iface {
  using namespace my_types;
  bool check(Foo foo, Bar bar);
  bool listen(Foo foo, Bar bar);
  bool digest(Foo foo, Bar bar);
}

This is still a bad idea. You are not polluting the global namespace any more, but you are polluting your own. There are at least two problems with this approach: First, someone who uses your libraries might now accidentally qualify Foo as my_iface::Foo. If you ever decide to clean up my_iface and remove the using directive, that poor guys code is going to break. Also, if this guy decides he wants to have using namespace my_iface somewhere in his code, he probably doesn’t want all your other namespaces (and std, boost or whatever) included as well.

As usual, normal politeness applies: A one-time annoyance for you is better than a repeated annoyance for all your users.

The Order of #include Directives Matter


I think most (good) programmers tend to prefer tidy code, having some habits or rules to stick to. One such rule/habit is the order of #include directives. While having a rule of thumb to stick to is good, some are better than other.

A very common one I think is to first include standard library headers (<iostream>), then third-party libraries (<boost/thread.hpp>) and finally local headers (“FizzBuzz.hpp”). While this might be the recommended way of doing it in for instance Python, it is not the best way to do it in C++. A colleague of mine just went through a lot of pain when swapping out a big library in their codebase. Why was that?

Imagine you are writing a library that depends on some other library, like the STL (hardly any library doesn’t). Imagine you are a good boy (or girl) and write the test first. You might do something like this:

#include <vector> //STL
#include <gtest/gtest.h> //ThirdParty 
#include "geometry.hpp" //In-house

#define PI 3.2 //As pr. Indiana Bill #246, 1897

TEST(TestGeometry, rotatingOrigoGivesOrigo) {
    std::vector<double> v(2,0);
    rotate(v, PI);
    EXPECT_EQ(0, v[0]) << "X got moved!";
    EXPECT_EQ(0, v[1]) << "Y got moved!";
}

and your geometry.hpp looks like this:

void rotate(std::vector<double>& v, double angle);

This all works out nicely until someone else wants to include geometry.hpp without including <vector> first, and get something like geometry.hpp:1: error: ‘vector’ is not a member of ‘std’. You have now forced all the users of your geometry library to include <vector> before including geometry.hpp.

While this can seem like a trivial example, this stuff quickly grows a lot hairier when you have a larger codebase with lots of dependencies. A header file A might depend on header B which depends on C, which declares types you have never heard of, and rightly so. And when you try to compile, you get error: ‘würkelschmeck’ is not a member of ‘std’. Or something a lot worse if templates are involved.

My suggested rule is:

  1. Local headers
  2. Third party headers
  3. STL headers

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter

The Simplest Possible Introduction to Traits


This is the simplest introduction to traits I could think of:

A trait is information about a type stored outside of the type.

Imagine you are making a templated algorithm, and want to use different policies for different types. If you wrote those types, you could implement the policy selection in your types directly, but this is generally not possible for built in types and types you haven’t created yourself. Traits to the rescue!

Given

template <class T>
void f(T var) {
    bool advanced = policy_trait<T>::advanced;
    if ( advanced )
        cout << "Do it fancy!" << endl;
    else
        cout << "KISS." << endl;
}

you can define policies for whatever type you want by creating template specializations of policy_trait

template <>
struct policy_trait<char> {
    static const bool advanced = false;
};

That’s it! :)

(The original paper on traits is Traits: a new and useful template technique by Nathan C. Myers, but An introduction to C++ Traits by Thaddaeus Frogley is probably a more intuitive introduction.)

Why You Shouldn’t Throw in Destructors


In my post two weeks ago, What’s the Point of Function Try Blocks, I mentioned that you shouldn’t throw in destructors. Why exactly is this?

Imagine the following:

void f() {
  MyClass c;
  functionThatMightThrow();
}

If functionThatMightThrow throws, stack unwinding occurs. What this means is basically that all the objects on the stack get destructed, from the innermost block and out. In this case, c is destroyed before the exception is thrown out of f.

But what happens if ~MyClass() throws? There would now be two simultaneously active exceptions, something that luckily is forbidden by the standard. Instead, terminate() is called, and the whole program stops.

So what can you do? Design around it. Log and swallow. Ignore. Abort. If your class is managing some resource that should be cleaned up, but failing to do so is not critical, you could also provide a close() method that throws on error. A client who feels it is necessary to be 100% sure the resource is cleaned up will then have the option of calling close() and handling the exception himself, before your destructor is called.

Note that relying on the client to remember to call close(), setUp(), init() etc. is generally a bad idea, but might be acceptable in some cases. What makes it acceptable in this case is if cleaning up the resource is optional.

Why Algorithms Need two Iterators


When using algorithms in C++, such as find(), sort(), etc., you might wonder why you need to specify both the beginning and the end of the container, and not just a reference to the container itself. See for instance:

vector<int> v; //Assume it also has some elements
sort(v.begin(), v.end());

Wouldn’t it be better if we could just write

sort(v);

and let sort call begin() and end()?

If this was the case, algorithms would only work on containers that define begin() and end(). Algorithms in C++ do however work on any type that has an iterator. And an iterator is not a type, it is pure abstraction. Anything that behaves like an iterator is an iterator. If a type has a concept of the element currently pointed to, pointing to the next element and equality, it can be an iterator.

One important example is pointers. C++ algorithms work with standard arrays, because pointers behave as iterators. It would be impossible to do this:

int a[42]; //
sort(a);

How would sort get a hold of iterators (pointers) to the beginning and end of the array? The beginning is easy, but since the length of the array is lost when passed to a function, the end is not in sight. For this to work, one would have to have special versions of all algorithms, taking the size of the array as an extra argument. This would lead to a loss of generality, and double the amount of algorithms.

Passing two iterators however preserves generality, and can be used with anything that behaves like an iterator.

Note that algorithms are not defined to take a specific iterator type, like this:

void sort(iterator begin, iterator end); //All iterators inherit from iterator

Instead, they are defined like this:

template <class It>
void sort(It begin, It end); //The type of It can be whatever

First, this lets us use pointers as iterators. Pointers of course do not inherit from an iterator type. Second, this relieves us of runtime polymorphism (which could negatively impact both speed and size). If you try use a type that does not provide the necessary iterator operators, the compiler will catch it when sort() tries to use them.

This is typical for templates; with compile time polymorphism, anything that provides the interface used by the templated function is acceptable, regardless of what (if anything) it inherits from. This is often referred to as duck typing: “when I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.” This is also what allows you to make a vector of any type. Note that this is different from for instance Java, which uses inheritance for collections instead of duck typing. In Java, all classes inherit from Object, and you can only have collections of objects, not primitive types.

What’s the Point of Function Try Blocks


This post is a follow up to The Returning Function that Never Returned, which I wrote a couple of months ago. You can read it first if you want, it is only around a hundred words, but this post can be read on its own as well.

In my previous post, I presented a function with try outside of the braces. This is called a “function try block”. The example went like this:

std::string foo() try {
    bar();
    return "foo";
} catch (...) {
    log("Unable to foo!")
}

This is completely useless, even dangerous. The try should be moved inside the function itself, like this:

std::string foo() {
    try {
        bar();
        return "foo";
    } catch (...) {
        log("Unable to foo!")
        //either rethrow or throw something else
    }
    //or make sure something is returned
}

So why were function try blocks introduced to the language in the first place? Have a look at this:

class A : public B{
    A(int x) : c(x){}
    C c;
}

How do you catch an exception thrown by Cs constructor? Having try inside the function block is too late. Initializing c inside the function block is not a solution either, since c will always be initialized before the body of the constructor, even if it is not mentioned in the initializer list. The only way to catch such an exception is to use a function try block on the constructor, and that is also why they were introduced. (Note that this argument is also valid if Bs constructor might throw.)

This is also the only sane case in which to use them. The other to candidates are regular functions and destructors, both of which I will get back to in a moment.

First, let’s have a look at what you can do with a function try block on a constructor. When the catch block reaches its end, it will rethrow the exception. It is impossible to swallow it. If you could swallow the exception, the code that tried to construct an object of this type would have no way of knowing that construction failed. A failed construction means the object doesn’t exist, and it doesn’t make sense to continue pretending nothing has happened. The only thing you can do is to throw an exception of another type, or cause a side effect (such as logging) and then retrhow. In particular, you cannot try to recover from the problem.

The same goes for destructors, you cannot swallow the exception. And since you really should avoid having destructors throw, using function try blocks on destructors is a bad idea.

Try blocks on regular functions behave a bit differently; if the end of the catch block is reached, the function will automatically return. But if you have a non-void function, this doesn’t make sense at all, as I mentioned in my previous post.

So in conclusion:

  1. Only use function try blocks for constructors.
  2. Don’t try to do anything else than rethrowing (possibly another type) or cause side effects like logging.

For a more in-depth discussion of this, have a look at Sutter’s Mill: Constructor Failures (or, The Objects That Never Were) by Herb Sutter. If you would like a recap of exception handling and constructor initializers thrown in, I recommend to start with Introduction to Function Try Blocks by Alan Nash.