Keeping header file dependencies to a minimum in C++ is always a good idea. There’s a great book on the subject — John Lakos’s Large Scale C++ Design — but there’s plenty of little tricks that aren’t mentioned. In this article I’m going to discuss a handy trick I’ve discovered in reducing dependencies, particularly useful for STL headers.
Take for example, this header file, which declares a class for averaging integers:
// Averager.h
#include <list>
class Averager
{
public:
Averager();
// Add a single int to the running average.
void Add(int);
// Add a list of ints to the running average.
void Add(const std::list<int>&);
// Get the current average.
int GetAverage() const;
private:
int mNumAdded; // number of ints added
int mTotal; // running total so far
};
As ever, this is a fairly contrived example. Typically you’d imagine applications using
this class via the Add(int)
method, then extract the average through GetAverage
.
For STL convenience, there’s a method accepting a std::list
of integers too.
So what’s wrong with this code? Well, in some respects, nothing. Like Lakos recommends, the
header file is self-contained — it will compile by itself as it includes the <list>
header. So how to improve this?
Well, let’s look at the include graph (courtesy of IncludeManager):
Good grief, all this for one include?
The top box is the original header, the rest is what
<list>
brings in.
What a mess! Anything wanting to call Add(const std::list<int>&)
includes "Averager.h"
and gets <list>
included.
Fair enough. But what about things that just want to use the simpler add method Add(int)
?
They don’t need <list>
but have it thrust upon them along with all its dependencies, the cost of parsing and compiling the
unused std::list
code. This is far from the C idea of “you only pay for what you use.”
The usual way to solve this kind of issue is by forward declaration, where you
just declare the class needed in-place, and don’t define it. C++ allows you to
refer to a declared (but not defined) class as long as you don’t try and find
its size, or call any functions within it. This means you don’t need the #include <list>
after all:
// NaiveAverager.h
// Forward declare std::list.
namespace std { template<typename T> class list; }
class Averager
{
//...as before...
};
But this doesn’t won’t work! std::list
— being an STL class — isn’t just templated on
the type that it holds. It also has an allocator parameter which controls how the memory needed
is allocated. This second parameter is defaulted to allocator<T>
, and forward declaring this
gets more and more complex. Not to mention, std::list
isn’t necessarily as simply defined as
even that. In Visual Studio 2005’s implementation, std::list
publicly inherits from _List_val<_Ty, _Ax>
,
and on GCC 4.2.3 it uses protected inheritance from _List_base<_Tp, _Alloc>
.
Does this mean that forward declaration isn’t possible?
Not necessarily. There’s a little trick to getting around this issue; though it’s not without
its drawbacks. If we define our own non-templated list class and use that instead, we wouldn’t
have these issues with predeclaration. But implementing our own list class isn’t easy, and
we don’t get all the benefits of std::list
’s implementation. Well, so you might think — but
how about something like this, in MyList.h
:
// MyList.h
#include <list>
template<typename T>
class MyList : public std::list<T> {};
And then:
// NewAverager.h
// Forward declare a template list.
template<class T> class MyList;
class Averager
{
public:
Averager();
// Add a single int to the running average.
void Add(int);
// Add a list of ints to the running average.
void Add(const MyList<int>&);
// Get the current average.
int GetAverage() const;
private:
int mNumAdded; // number of ints added
int mTotal; // running total so far
};
Now we have the situation where clients of the Averager
class who use the std::list
function
must include the "MyList.h"
header, and as such have to suffer the dependencies and compile
time of <list>
. However, if the client just needs the single integer Add(int)
, then they just include
"NewAverager.h"
and don’t get <list>
at all. While this isn’t the traditional façade mentioned
in the GoF’s book, the name “façade” seems too good a fit, so that’s what I call this technique.
Of course, nothing’s perfect. The main drawback of this approach is that in order to use any constructors of the base class you need to explicitly re-implement them (usually just passing parameters through to the base implementation) in your façade class.
But how much difference does this really make? Of course, this depends on your code and its other dependencies. Using IncludeManager’s project view you can see the cost of the files:
Project details for my example project.
listclient.cpp
uses the std::list
functionality,
simpleclient.cpp
just uses Add(int)
.
The “PP Tokens” column is the number of post preprocessing tokens in a file, and “PP Total Tokens”
takes into account all the #include
d files’ tokens too.
This gives a rough guide to compile-time complexity. As you can see, listclient.cpp
contains 47
tokens in itself, but the total cost of its inclusion is some 68,000 tokens. simpleclient.cpp
has around 33 tokens in itself, and its total cost is only 79 — rather fewer tokens than listclient.cpp
.
While this is a contrived example, you can get some flavour of the improvements that can be made with this
idea.
We’ve this technique to great effect in ProFactor’s code, specifically for strings and a few key object
lists (which façade std::list<Type>
). Of course, best of all it sticks to that most basic of
C tenets — you should only pay for what you actually use.
The source code used here is available in 7-Zip format. Note there’s no actual implementation
of Averager
anywhere.
Matt Godbolt is a C++ developer working in Chicago for Aquatic. Follow him on Mastodon.