How to avoid includes in headers

I have now written twice about why you should minimize the use of include in header files. As one reader on Reddit politely put it (“crap article”), it is about time I write a post about how to do that.

There is mainly two things you can do:

  1. Reduce the number of headers you include.
  2. Reduce the contents of those headers.

1: Reducing the number of includes

Why do you need includes in the header in the first place? When you compile a file that includes header A, all the names that are used in header A need to be defined. So if header A uses an name defined in header B, header A should include header B (or else anyone who includes A will have to include B as well).

What do I mean by “using” a name then? Basically everything that mentions the name, except as a pointer or reference. Some examples of using a name:

SomeClass f; //Creating an object
some_function(); //Calling a function
class Derived : public SomeClass; //Inheriting from a class

Some examples of mentioning a name but not using it (only needs access to a declaration, not the definition):

SomeClass* f1; //Declaring a variable of pointer type
SomeClass& f2; //Declaring a variable of reference type

As soon as you want to use those though, they must be completely defined:

f1->function(); //Needs definition
f2->member; //Needs definition

Also note that the new type of enum, (enum class) only needs to be declared to be used, as long as you don’t actually mention its value:

SomeEnum e; //Declaring a variable of oldschool enum type needs definition.
SomeEnumClass e; //Declaring a variable of newschool enum type does not need definition...
SomeEnumClass e = SomeEnumClass::VALUE; //...but using a value does.

So given this knowledge, how do we reduce the number of includes in headers?

The first thing to do is to make sure you only include what is necessary to get the header file to compile. If you have a class definition in class.h and the implementation in class.cpp, the latter will typically need includes that are not needed by the header, and so should be put in the cpp. A simple way to check that you don’t have any unnecessary includes is to create an empty cpp file that only includes class.h. If it compiles, all the necessary headers are there. Then you can try commenting out includes in class.h and see which ones actually need to be there. Move the rest down to class.cpp.

The next thing to do is to check whether you actually need access to the definitions, or if a declaration will do. Try commenting out one #include at a time, and checking which names the compiler complains about. Then see if you are actually using that name, or if you are merely holding a pointer or a reference to an object of the type. If so, move the include down to the .cpp file, and add a forward declaration. Here’s an example:

#include "stuff.h"

class Ohlson
    void doThings(Stuff* stuff);

The include stuff.h in not needed, as we never use stuff. Instead, we can use a forward declaration:

class Stuff; //Forward declaration instead of include

class Ohlson
    void doThings(Stuff* stuff);

In the definition of Ohlson::doThings(Stuff* stuff) however (typically in ohlson.cpp), we probably need access to the definition of Stuff and need to include it.

So one of the things that will reduce the number of includes in headers is using pointers/references instead of values. There are other considerations to design as well though, so I would not advocate moving from values to pointers without considering the full picture.

Another thing that helps is to have the definition of functions in the .cpp file instead of the .h file. This however prevents inlining, so again, use caution.

A final technique that could help is the Pimpl Idiom. Briefly explained, instead of keeping your private members in the header, you keep them in a struct in the .cpp file. This means you don’t need access to their definitions in the header file, and can move their respective includes down to the .cpp file as well.

Reducing the content of includes

The other thing you can do to reduce the negative effects of includes in headers, if you cannot avoid them, is to reduce the content of those included headers. Here are some techniques:

C++ supports several paradigms, but a good recommendation for most is to depend on abstractions instead of implementations. In object oriented programming, this is done by programming to interfaces. C++ doesn’t have an explicit interface concept, but a class with only pure virtual methods is the same thing. Interfaces should also be kept small and focused. This means an interface should be very quick to compile, reducing the pain of having it included in many files. You will however still need to recompile all the files that include it when it changes, but the more focused it is, the fewer files will depend on it.

A generalization of the above is to avoid large inheritance hierarchies, even if you aren’t able to only use pure interfaces.

If you absolutely need to include a file in your header, but it is one that you own, you can reduce the content of that file by applying the techniques mentioned in the first part of this article to that file. (This is just section 1 seen from the other side of the table.)

A less straightforward technique is to be a bit clever with your templates. Templates are usually completely defined in header files, meaning that a change in the implementation triggers a recompile of everything that uses that template. There is however ways to move the implementation of templates out of header files, for instance as in this Dr. Dobbs article. C++11’s extern template can also help reduce compilation time. Avoiding the use of templates completely of course also solves the problem, but there was probably a reason why you wanted them in the first place.

One final technique that cannot go unmentioned is precompiled headers. While I don’t like what they do to dependencies, I consider them a necessary evil in medium to large projects to keep compile-times reasonable. Precompiled headers are very well explained in the first link in this paragraph, but here’s a short version: Headers can take a long time to compile. If they also change rarely, it would be nice to be able to cache them, and this it what precompiled headers does. In Visual Studio for instance, you put these includes in a file conventionally called stdafx.h, and then include stdafx.h in all your cpp files instead of the actual headers you need. The compiler then only has to compile them once. Headers from the standard library are typically placed here, and third party headers are also usually a good fit.

Those are the techniques I could think of, but I am sure I missed some. Which are you favourite ones? Which did I miss? Which of them are silly? Please use the comment section below.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Another Reason to Avoid #includes in Headers

I have already argued that you shouldn’t put all your includes in your .h files. Here is one more reason, compilation time.

Have a look at this example, where the arrows mean “includes”

file5.h apparently need access to something which is defined in file4.h, which again needs access to three other headers. Because of this, all other files that include file5.h also includes file[1-4].h.

In c++, #include "file.h" really means “copy the entire contents of file.h here before compiling”. So in this example, file[1-3] is copied into file4.h, which is then copied into file5.h, which again is copied into the three cpp files. Every file takes a bit of time to compile, and now each cpp file doesn’t only need to compile its own content, but also all of the (directly and indirectly) included headers. What happens if we add some compilation times to our diagram? The compilation time of the file itself is on the left, the compilation time including the included headers are on the right.

As we can see, this has a dramatic effect on compilation time. file6.cpp and file7.cpp just needed something from file5.h, but got an entire tree of headers which added 1.2 seconds to their compilation times. This might not sound much, but those numbers add up when the number of files is large. Also, if you’re trying to do TDD, every second counts. And in some cases, compilation times of individual headers can be a lot worse than in this example.

What if file5 didn’t really need to have #include "file4.h" in the header, but could move it to the source file instead? Then we would have this diagram:

The compilation time of file[6-7].cpp is significantly reduced.

Now let’s look at what happens if a header file is modified. Let’s say you need to make a minor update in file1.h. When that file is changed, all the files that include it need to be recompiled as well:

But if we were able to #include "file4.h" in file5.cpp instead of file5.h, only one cpp file would need to recompile:

Recompilation time: 1.7 seconds vs. 4.5 seconds. And in a large project, this would be a lot worse. So please try to move your #includes down into the cpp files whenever you can!

The Graphviz code and makefile for this blog post is available on GitHub.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

What I Have Against Precompiled Headers

Precompiled headers is a technique to reduce compilation time when your #includes take a long time. Basically, you stuff all your #includes that rarely change into one single file, which you then include in all your other files. You then tell the compiler to precompile this header, so it takes a negligible amount of time to include later on.

Instead of doing:

#include "my_file.h"
#include "3rdparty/fancy_stuff.h"
#include "boost/expensive_lib.h"
#include <string>

you do

#include "stdafx.h" //Common name for precompiled headers in Visual Studio
#include "my_file.h"

#include "boost/expensive_lib.h"
#include "3rdparty/fancy_stuff.h"
#include <string>

Say compiling "boost/expensive_lib.h" and "3rdparty/fancy_stuff.h" takes three seconds, and you have twenty files including these, the files themselves beeing trivial to compile, taking half a second each. That means compilation will take 20*(3s + 0.5s) = 70 seconds. With precompiled headers, this takes you 20*0.5s + 3s = 13 seconds. And the next time you compile, stdafx.h is already compiled, so it only takes you 10. That’s roughly an order of magnitude quicker.

A great improvement, no?

While it improves compilation time, it certainly does not improve readability and maintainability. Personally, I think of it as a bit of an ugly hack.

There are at least three problems with this:

1: As your project evolves, stdafx.h can get quite crowded. Come refactoring-time, you want to remove #includes which are no longer in use. While this can be challenging enough in a large cpp-file, imagine having one file holding includes from tens of others. Now you don’t only have to look through the current file to see if a certain #include can be removed, you have to look through you entire project.

2: stdafx.h now has to be the first header you include, which violates best pratice for include order.

3: Precompiled headers make .cpp-files quicker to compile. But what about when a header itself depends on other expensive headers? Normally, these need to be included in the header itself, leading to long compilation time for everyone using this header. A popular technique with precompiled headers is to not include these in the header. Since all .cpp-files will have included stdafx.h before including this header, everything will be ok. In addition to the problems laid out in 2, this poses a bigger problem when this header-file is included in .cpp-files in other projects, or even other solutions. If someone now wants to include my my_header.h, it is suddenly their responsibility to include headers that my_header.h uses internally.

That about sums it up. Consider precompiled headers added to my list of “necessary evils in C++”.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.