Another Reason to Avoid #includes in Headers


I have already argued that you shouldn’t put all your includes in your .h files. Here is one more reason, compilation time.

Have a look at this example, where the arrows mean “includes”

file5.h apparently need access to something which is defined in file4.h, which again needs access to three other headers. Because of this, all other files that include file5.h also includes file[1-4].h.

In c++, #include "file.h" really means “copy the entire contents of file.h here before compiling”. So in this example, file[1-3] is copied into file4.h, which is then copied into file5.h, which again is copied into the three cpp files. Every file takes a bit of time to compile, and now each cpp file doesn’t only need to compile its own content, but also all of the (directly and indirectly) included headers. What happens if we add some compilation times to our diagram? The compilation time of the file itself is on the left, the compilation time including the included headers are on the right.

As we can see, this has a dramatic effect on compilation time. file6.cpp and file7.cpp just needed something from file5.h, but got an entire tree of headers which added 1.2 seconds to their compilation times. This might not sound much, but those numbers add up when the number of files is large. Also, if you’re trying to do TDD, every second counts. And in some cases, compilation times of individual headers can be a lot worse than in this example.

What if file5 didn’t really need to have #include "file4.h" in the header, but could move it to the source file instead? Then we would have this diagram:

The compilation time of file[6-7].cpp is significantly reduced.

Now let’s look at what happens if a header file is modified. Let’s say you need to make a minor update in file1.h. When that file is changed, all the files that include it need to be recompiled as well:

But if we were able to #include "file4.h" in file5.cpp instead of file5.h, only one cpp file would need to recompile:

Recompilation time: 1.7 seconds vs. 4.5 seconds. And in a large project, this would be a lot worse. So please try to move your #includes down into the cpp files whenever you can!

The Graphviz code and makefile for this blog post is available on GitHub.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

What I Have Against Precompiled Headers


Precompiled headers is a technique to reduce compilation time when your #includes take a long time. Basically, you stuff all your #includes that rarely change into one single file, which you then include in all your other files. You then tell the compiler to precompile this header, so it takes a negligible amount of time to include later on.

Instead of doing:

//my_file.cpp
#include "my_file.h"
#include "3rdparty/fancy_stuff.h"
#include "boost/expensive_lib.h"
#include <string>

you do

//my_file.cpp
#include "stdafx.h" //Common name for precompiled headers in Visual Studio
#include "my_file.h"

//stdafx.h
#include "boost/expensive_lib.h"
#include "3rdparty/fancy_stuff.h"
#include <string>

Say compiling "boost/expensive_lib.h" and "3rdparty/fancy_stuff.h" takes three seconds, and you have twenty files including these, the files themselves beeing trivial to compile, taking half a second each. That means compilation will take 20*(3s + 0.5s) = 70 seconds. With precompiled headers, this takes you 20*0.5s + 3s = 13 seconds. And the next time you compile, stdafx.h is already compiled, so it only takes you 10. That’s roughly an order of magnitude quicker.

A great improvement, no?

While it improves compilation time, it certainly does not improve readability and maintainability. Personally, I think of it as a bit of an ugly hack.

There are at least three problems with this:

1: As your project evolves, stdafx.h can get quite crowded. Come refactoring-time, you want to remove #includes which are no longer in use. While this can be challenging enough in a large cpp-file, imagine having one file holding includes from tens of others. Now you don’t only have to look through the current file to see if a certain #include can be removed, you have to look through you entire project.

2: stdafx.h now has to be the first header you include, which violates best pratice for include order.

3: Precompiled headers make .cpp-files quicker to compile. But what about when a header itself depends on other expensive headers? Normally, these need to be included in the header itself, leading to long compilation time for everyone using this header. A popular technique with precompiled headers is to not include these in the header. Since all .cpp-files will have included stdafx.h before including this header, everything will be ok. In addition to the problems laid out in 2, this poses a bigger problem when this header-file is included in .cpp-files in other projects, or even other solutions. If someone now wants to include my my_header.h, it is suddenly their responsibility to include headers that my_header.h uses internally.

That about sums it up. Consider precompiled headers added to my list of “necessary evils in C++”.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Don’t Put All Includes in the .hxx


Ok, one more thing about headers, then I promise to talk about something else.

Just because a header file is called a header file, you shouldn’t include all your headers there. Often, a source file, where the definition of you functions and/or classes are, need quite a lot of headers to do its work. It might for instance need <algorithm> to do some sorting or searching, a fact your declarations don’t need to worry about:

processing.hxx

#include "MyType.hxx"

void process(MyType obj); 

processing.cc

#include "processing.hxx"
#include <algorithm>

void process(MyType obj) {
    std::find(...);
    std::sort(...);
}

You almost always need to include some headers in the .hxx, so why not put them all there? While compiling your .cc file, it doesn’t really matter, since all those files are included anyway. But when someone else includes your header, they suddenly depend on everything your implementation depends on. This also adds to compilation time for your users. So do your includes as listed above, don’t move

Use Your Head Before Using Using in Headers


Speaking of headers, did you ever think about what you are really doing when you put using namespace my_util in a header file? Sure, it might be annoying to namespace-qualify all the types in your declarations. But using namespace my_util forces all the names in my_util into the global namespace of everyone who includes your header file.

As usual, here is a small example:

my_iface.hxx:

bool check(my_types::Foo foo, my_types::Bar bar);
bool listen(my_types::Foo foo, my_types::Bar bar);
bool digest(my_types::Foo foo, my_types::Bar bar);

The repeated namespace qualifying gets annoying, so you would rather do:
my_iface.hxx:

using namespace my_types;
bool check(Foo foo, Bar bar);
bool listen(Foo foo, Bar bar);
bool digest(Foo foo, Bar bar);

This is however a bad idea. “using namespace my_types” (a using directive) exports all the types in my_types into the global namespace, which I think everyone agrees is a bad idea. You could argue that a using declaration is better, as “using my_types::Foo” only places Foo into the global namespace. I would however disagree, as a smaller sin is still a sin.

And you’re not getting away that easily, this argument is also valid for std! Yes, you need to type my_func(std::vector v) in header files.

A final trick you might try is to only use using namespace inside of your other namespaces, like so:
my_iface.hxx:

namespace my_iface {
  using namespace my_types;
  bool check(Foo foo, Bar bar);
  bool listen(Foo foo, Bar bar);
  bool digest(Foo foo, Bar bar);
}

This is still a bad idea. You are not polluting the global namespace any more, but you are polluting your own. There are at least two problems with this approach: First, someone who uses your libraries might now accidentally qualify Foo as my_iface::Foo. If you ever decide to clean up my_iface and remove the using directive, that poor guys code is going to break. Also, if this guy decides he wants to have using namespace my_iface somewhere in his code, he probably doesn’t want all your other namespaces (and std, boost or whatever) included as well.

As usual, normal politeness applies: A one-time annoyance for you is better than a repeated annoyance for all your users.

The Order of #include Directives Matter


I think most (good) programmers tend to prefer tidy code, having some habits or rules to stick to. One such rule/habit is the order of #include directives. While having a rule of thumb to stick to is good, some are better than other.

A very common one I think is to first include standard library headers (<iostream>), then third-party libraries (<boost/thread.hpp>) and finally local headers (“FizzBuzz.hpp”). While this might be the recommended way of doing it in for instance Python, it is not the best way to do it in C++. A colleague of mine just went through a lot of pain when swapping out a big library in their codebase. Why was that?

Imagine you are writing a library that depends on some other library, like the STL (hardly any library doesn’t). Imagine you are a good boy (or girl) and write the test first. You might do something like this:

#include <vector> //STL
#include <gtest/gtest.h> //ThirdParty 
#include "geometry.hpp" //In-house

#define PI 3.2 //As pr. Indiana Bill #246, 1897

TEST(TestGeometry, rotatingOrigoGivesOrigo) {
    std::vector<double> v(2,0);
    rotate(v, PI);
    EXPECT_EQ(0, v[0]) << "X got moved!";
    EXPECT_EQ(0, v[1]) << "Y got moved!";
}

and your geometry.hpp looks like this:

void rotate(std::vector<double>& v, double angle);

This all works out nicely until someone else wants to include geometry.hpp without including <vector> first, and get something like geometry.hpp:1: error: ‘vector’ is not a member of ‘std’. You have now forced all the users of your geometry library to include <vector> before including geometry.hpp.

While this can seem like a trivial example, this stuff quickly grows a lot hairier when you have a larger codebase with lots of dependencies. A header file A might depend on header B which depends on C, which declares types you have never heard of, and rightly so. And when you try to compile, you get error: ‘würkelschmeck’ is not a member of ‘std’. Or something a lot worse if templates are involved.

My suggested rule is:

  1. Local headers
  2. Third party headers
  3. STL headers

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter