A taxonomy of C++ types


You may have heard of things like fundamental types, built-in types, basic types, integral types, arithmetic types, and so on. But what do they all mean, if anything?

In this post I’ll build up the hierarchy of C++ types, eventually arriving at a big tree like the one in the following figure. I promise we’ll take it one step at a time, so it all makes sense in the end.

[Figure: the complete type hierarchy. Intentionally unreadable for now, just to show the structure.]

Whoa, that’s a lot! Let’s start with something seemingly simple, like ints.

Integer types

There are five standard signed integer types: “signed char”, “short int”, “int”, “long int”, and “long long int”. The implementation is also allowed to define an arbitrary number of implementation-defined extended signed integer types such as GCC’s __int128.

Together, the standard and extended signed integer types are called the signed integer types. Let’s visualise this:

For each of the standard signed integer types, there exists a corresponding (but different) standard unsigned integer type. These are “unsigned char”, “unsigned short int”, “unsigned int”, “unsigned long int”, and “unsigned long long int”. Similarly, for each extended signed integer type that the implementation defines, it has to define a corresponding extended unsigned integer type. For example, GCC defines unsigned __int128 corresponding to __int128.
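The correspondence between signed and unsigned types can be checked at compile time. Here is a minimal sketch, assuming a C++17 compiler and the standard <type_traits> header; it only covers the standard types, since the extended ones vary between implementations.

```cpp
#include <type_traits>

// Each standard signed integer type maps to its unsigned counterpart.
static_assert(std::is_same_v<std::make_unsigned_t<signed char>, unsigned char>);
static_assert(std::is_same_v<std::make_unsigned_t<short>, unsigned short>);
static_assert(std::is_same_v<std::make_unsigned_t<int>, unsigned int>);
static_assert(std::is_same_v<std::make_unsigned_t<long>, unsigned long>);
static_assert(std::is_same_v<std::make_unsigned_t<long long>, unsigned long long>);

// Corresponding types are different types, but occupy the same amount of storage.
static_assert(!std::is_same_v<int, unsigned int>);
static_assert(sizeof(int) == sizeof(unsigned int));
static_assert(std::is_signed_v<int> && std::is_unsigned_v<unsigned int>);
```

If the file compiles, every claim in it holds on your implementation.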

Together, the standard and extended unsigned integer types are called the unsigned integer types. Let’s visualise these too, and notice the correspondence to the previous figure:

If we throw in “bool” and the character types “char”, “wchar_t”, “char8_t”, “char16_t”, and “char32_t”, we have all the integral types, also known as the integer types. (Character types have some further subdivisions that are best postponed to a separate blog post.)
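The std::is_integral trait mirrors this grouping, so you can ask the compiler which types belong. A small sketch, assuming C++17 (char8_t is left out because it needs C++20):

```cpp
#include <type_traits>

// bool and the character types are integral types too.
static_assert(std::is_integral_v<bool>);
static_assert(std::is_integral_v<char>);
static_assert(std::is_integral_v<wchar_t>);
static_assert(std::is_integral_v<char16_t>);
static_assert(std::is_integral_v<char32_t>);

// So are the signed and unsigned integer types, including cv-qualified versions.
static_assert(std::is_integral_v<long long>);
static_assert(std::is_integral_v<const unsigned short>);

// Floating-point types are not integral.
static_assert(!std::is_integral_v<double>);
```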


Floating-point Types

That’s it for the integral types, now for the floating-point types. Luckily, this is much simpler, since there’s only “float”, “double”, and “long double”, and they’re all signed. So there are only three standard floating-point types. Then, just as for the integral types, the implementation is allowed to define extended floating-point types. So collectively we have these floating-point types:
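The same kind of compile-time check works for the three standard floating-point types. A sketch, again assuming C++17:

```cpp
#include <type_traits>

// The three standard floating-point types.
static_assert(std::is_floating_point_v<float>);
static_assert(std::is_floating_point_v<double>);
static_assert(std::is_floating_point_v<long double>);

// They are all signed.
static_assert(std::is_signed_v<float>);
static_assert(std::is_signed_v<double>);
static_assert(std::is_signed_v<long double>);

// They are three distinct types, even on platforms where two of them
// happen to share a representation.
static_assert(!std::is_same_v<double, long double>);

// Integral types are not floating-point types.
static_assert(!std::is_floating_point_v<int>);
```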

Arithmetic and Fundamental Types

Collectively, the integral and floating-point types form the arithmetic types. Throw in void and std::nullptr_t, and we have all the fundamental types:
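To see where the line between arithmetic and fundamental goes, here is a short sketch using std::is_arithmetic and std::is_fundamental (C++17 assumed):

```cpp
#include <cstddef>      // std::nullptr_t
#include <type_traits>

// Integral and floating-point types are arithmetic.
static_assert(std::is_arithmetic_v<bool>);
static_assert(std::is_arithmetic_v<int>);
static_assert(std::is_arithmetic_v<double>);

// void and std::nullptr_t are fundamental, but not arithmetic.
static_assert(!std::is_arithmetic_v<void>);
static_assert(!std::is_arithmetic_v<std::nullptr_t>);
static_assert(std::is_fundamental_v<void>);
static_assert(std::is_fundamental_v<std::nullptr_t>);

// Every arithmetic type is also fundamental.
static_assert(std::is_fundamental_v<int>);
static_assert(std::is_fundamental_v<long double>);

// A pointer is not fundamental.
static_assert(!std::is_fundamental_v<int*>);
```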

This is a good time to ask “But what about built-in and basic types?”. Those are terms that get thrown around in conversations and articles every now and then, but they don’t have an official meaning and are best avoided.

It’s understandable, however, that people refer to the fundamental types as “basic”, since they are indeed very basic compared to other types. They are “just an int”, “just a float” etc. Basically uncomplicated numbers with no additional semantics.

(Note that the term “basic type” was, in fact, used accidentally in a note (but never defined) in the standard until my PR #7287, and should be gone in C++26.)

Compound Types

So if the fundamental types are the simple, basic ones, what are the rest? The rest of the types are the compound types. The word “compound” normally means “made up of two or more parts”, so it makes sense that you find classes, unions, and arrays in this category. However, you also find enums, pointers, references, and functions in this category. I’m not sure why they chose “compound” for these, but that’s the name we have for “all the other types”, or the “non-fundamental types”:
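The "all the other types" nature of the compound types shows up directly in std::is_compound, which is true exactly when std::is_fundamental is false. A sketch assuming C++17; Point, Number, Color, and exactly_one are hypothetical names made up for the illustration:

```cpp
#include <type_traits>

struct Point { int x; int y; };     // hypothetical class type
union Number { int i; float f; };   // hypothetical union type
enum class Color { red, green };    // hypothetical enumeration type

static_assert(std::is_compound_v<Point>);
static_assert(std::is_compound_v<Number>);
static_assert(std::is_compound_v<Color>);
static_assert(std::is_compound_v<int[3]>);        // array
static_assert(std::is_compound_v<int*>);          // pointer
static_assert(std::is_compound_v<int&>);          // reference
static_assert(std::is_compound_v<int(int)>);      // function
static_assert(std::is_compound_v<int Point::*>);  // pointer to member

// Every type is either fundamental or compound, never both.
template <class T>
constexpr bool exactly_one = std::is_fundamental_v<T> != std::is_compound_v<T>;

static_assert(exactly_one<int> && exactly_one<void>);
static_assert(exactly_one<Point> && exactly_one<int*>);
```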

Scalar Types

Finally, I need to mention scalar types, also shown in the diagram above. Luckily for us, no new types are introduced here; the scalar types are just a grouping of types we’ve already discussed. The scalar types are all the arithmetic types, plus enums, pointers, pointers to members, and std::nullptr_t.
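Scalar types cut across the fundamental/compound split, which is easy to see with std::is_scalar. A sketch assuming C++17; Point and Color are hypothetical types for the illustration. Note that the standard also counts pointer-to-member types as scalar.

```cpp
#include <cstddef>      // std::nullptr_t
#include <type_traits>

struct Point { int x; int y; };    // hypothetical class type
enum class Color { red, green };   // hypothetical enumeration type

// Arithmetic types are scalar.
static_assert(std::is_scalar_v<int>);
static_assert(std::is_scalar_v<double>);

// So are some compound types: enums, pointers, and pointers to members.
static_assert(std::is_scalar_v<Color>);
static_assert(std::is_scalar_v<int*>);
static_assert(std::is_scalar_v<int Point::*>);

// std::nullptr_t is scalar, but void is not.
static_assert(std::is_scalar_v<std::nullptr_t>);
static_assert(!std::is_scalar_v<void>);

// Classes, arrays, and references are not scalar.
static_assert(!std::is_scalar_v<Point>);
static_assert(!std::is_scalar_v<int[3]>);
static_assert(!std::is_scalar_v<int&>);
```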

The reason I bring up scalar types here is that the standard uses them to define a memory location. A memory location is either an object of scalar type (that is not a bit-field) or some bit-field stuff I want to skip for this article. Memory locations are fundamental to understanding the C++ memory model and multi-threaded programming.

Further reading

When writing this post, I considered bringing up integer promotions, conversions, and the usual arithmetic conversions, but decided that the post is already long enough. Shafik Yaghmour has a nice post about The Usual Arithmetic Confusions that you might want to check out.

On the Importance of Fitting in


Programming in an object-oriented language can be seen as an exercise in extending the type system. And if all your code is wrapped nicely in classes and functions, what’s left is just combining those using the language. Simple, right?

Seen from this viewpoint, designing your types correctly becomes very important. And the best way to design them correctly is to have them behave as much as possible like the built-in types and library types. (On a side note, this is one reason I dislike Java’s lack of operator overloading.)

As an example, say I am designing an embedded system for a car stereo. Every radio station is stored in a RadioStation class. There is also a RadioStationContainer class that manages the radio stations. Now we need a function to add RadioStations to the container. What do we name it? What name will make a good interface for the user of this library? addRadioStation()?

I would say a much better name is push_back(). Even though you might think addRadioStation() sounds like a more intuitive name, if you are making a container, I’d argue having it behave like all other containers is more intuitive.

How about allowing people to iterate over radio stations? The iterator type will depend on the type of container RadioStationContainer is using internally. One approach I’ve seen is to put something like this outside the RadioStationContainer class: typedef std::list<RadioStation>::iterator RSCit. This gives people a short and easy name for the iterator, right? Again, I would argue you should instead make a normal typedef inside the class, so people can use the normal RadioStationContainer::iterator. If they need a shorthand, they can make their own typedef.

Here is an example of a RadioStationContainer that can be used as a normal container:

#include <algorithm>
#include <iterator>
#include <list>

// RadioStation is assumed to be defined elsewhere.
class RadioStationContainer {
public:
    // Define the normal iterator types the user will expect
    typedef std::list<RadioStation>::iterator iterator;
    typedef std::list<RadioStation>::const_iterator const_iterator;

    // Default constructor and copy constructor
    RadioStationContainer() {}
    RadioStationContainer(const RadioStationContainer& rc) {
        std::copy(rc.begin(), rc.end(), std::back_inserter(stations));
    }

    // push_back() defined with the normal container interface
    void push_back(const RadioStation& s) { stations.push_back(s); }

    // Iterators for working with both const and non-const RadioStationContainers
    iterator begin() { return stations.begin(); }
    iterator end() { return stations.end(); }
    const_iterator begin() const { return stations.begin(); }
    const_iterator end() const { return stations.end(); }

private:
    std::list<RadioStation> stations;
};

This will fit nicely with how a user of the library expects a container to behave. But there is more! This will also fit very nicely with how the Standard Template Library expects a container to behave! You have already seen an example, using copy and back_inserter in the copy constructor. But now the user is also free to use transform, for_each etc:

// std::for_each lives in <algorithm>
void doStuffWithStation(RadioStation& s);

void f(RadioStationContainer& rc) {
    std::for_each(rc.begin(), rc.end(), doStuffWithStation);
}
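To try this out end to end, here is a self-contained sketch that should compile as C++11 or later. The RadioStation struct with its name and frequencyMHz members, the value_type typedef, and the helper functions are all assumptions made up for the illustration, not part of the class above. The value_type typedef is there because std::back_inserter expects it from a container, which is one more small way of fitting in.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for the RadioStation class.
struct RadioStation {
    std::string name;
    double frequencyMHz;
};

class RadioStationContainer {
public:
    typedef std::list<RadioStation>::iterator iterator;
    typedef std::list<RadioStation>::const_iterator const_iterator;
    // Assumption: added here so std::back_inserter can target this container.
    typedef RadioStation value_type;

    void push_back(const RadioStation& s) { stations.push_back(s); }

    iterator begin() { return stations.begin(); }
    iterator end() { return stations.end(); }
    const_iterator begin() const { return stations.begin(); }
    const_iterator end() const { return stations.end(); }

private:
    std::list<RadioStation> stations;
};

// Collects the station names using std::transform.
std::vector<std::string> stationNames(const RadioStationContainer& rc) {
    std::vector<std::string> names;
    std::transform(rc.begin(), rc.end(), std::back_inserter(names),
                   [](const RadioStation& s) { return s.name; });
    return names;
}

// Counts the stations above a given frequency using std::count_if.
std::ptrdiff_t countAbove(const RadioStationContainer& rc, double mhz) {
    return std::count_if(rc.begin(), rc.end(),
                         [mhz](const RadioStation& s) { return s.frequencyMHz > mhz; });
}

// Appends all stations of one container to another using std::copy.
void appendAll(const RadioStationContainer& from, RadioStationContainer& to) {
    std::copy(from.begin(), from.end(), std::back_inserter(to));
}

// A short walk-through of the helpers above.
void exampleUsage() {
    RadioStationContainer rc;
    rc.push_back(RadioStation{"Rock FM", 101.1});
    rc.push_back(RadioStation{"City News", 89.5});

    assert(stationNames(rc).size() == 2);
    assert(countAbove(rc, 100.0) == 1);

    RadioStationContainer all;
    appendAll(rc, all);
    assert(stationNames(all).front() == "Rock FM");
}
```

None of the helpers needed to know that a std::list sits inside the container; begin(), end(), push_back(), and the typedefs were enough.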

So when in doubt, always try to fit in.