A String Literal is not a string


The other day, I discovered a bug in some code that can be simplified to something like this:

A library of functions that handles a lot of different types:

void doSomethingWith(int i)           { cout << "int"    << endl; }
void doSomethingWith(double d)        { cout << "double" << endl; }
void doSomethingWith(const string& s) { cout << "string" << endl; }
void doSomethingWith(const MyType& m) { cout << "MyType" << endl; }

Used like this:

    doSomethingWith(3);
    doSomethingWith("foo");

It of course outputs:

int
string

Then someone wanted to handle void pointers as well, and added this function to the library:

void doSomethingWith(const void* i) { cout << "void*" << endl; }

What is the output now? Make up your mind before looking.

int
void*

What happened? Why did C++ pick the const void* overload instead of the const string& one that we wanted it to use?

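The type of a string literal is not string, but const char[N]: "foo" is a const char[4], counting the terminating null character. When deciding on the overloaded function to use, C++ prefers one that can be called using only built-in conversions over one that needs a user-defined conversion such as a constructor call. The array decays to a const char*, which converts implicitly to const void*, so the const void* overload is the only one in our example that can take the literal without constructing anything, and that one is picked.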
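If you want the compiler to confirm this, here is a small sketch using static_assert and <type_traits>. The helper literalArraySize is a hypothetical name I made up for illustration; it is not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// A string literal is an lvalue of type "array of N const char",
// where N counts the characters plus the terminating '\0'.
static_assert(std::is_same<decltype("foo"), const char (&)[4]>::value,
              "\"foo\" is a const char[4], not a std::string");

// The array reaches const void* through built-in conversions only
// (array-to-pointer, then a pointer conversion).
static_assert(std::is_convertible<const char (&)[4], const void*>::value,
              "built-in conversion to const void*");

// Reaching std::string is also possible, but only by calling
// std::string's converting constructor (a user-defined conversion).
static_assert(std::is_convertible<const char (&)[4], std::string>::value,
              "user-defined conversion to std::string");

// Hypothetical helper for illustration: reports the array size of a literal.
template <std::size_t N>
constexpr std::size_t literalArraySize(const char (&)[N])
{
    return N;
}
```

If any of the static_asserts were wrong, the file would simply fail to compile.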

Before that function was introduced, none of the functions could be used directly: neither const string& nor const MyType& can bind directly to a const char[], and the array cannot be implicitly converted to an int or a double. C++ then looked for non-explicit constructors that could convert the const char[] into a usable type, and found std::string::string(const char* s). It then went on to create a temporary std::string object, and passed a reference to this object to void doSomethingWith(const string& s), like this:

doSomethingWith(std::string("foo"))

But once the const void* version appeared as an alternative, the compiler preferred it: a conversion that needs only built-in steps always beats one that has to construct a temporary object.
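Here is a minimal sketch that reproduces the surprise, along with one possible fix. Everything in it is an assumption for illustration: the overloads are named which and return the name of the chosen overload instead of printing it, and the namespaces before and after only exist to keep the two variants apart. The fix shown, an extra const char* overload that forwards to the string version, is one option among several; wrapping the argument in std::string("foo") at the call site would work too.

```cpp
#include <string>

namespace before {
// Hypothetical stand-ins for the library overloads. Each returns the
// name of the overload that was selected.
inline std::string which(int)                { return "int"; }
inline std::string which(const std::string&) { return "string"; }
inline std::string which(const void*)        { return "void*"; }
}

namespace after {
inline std::string which(int)                { return "int"; }
inline std::string which(const std::string&) { return "string"; }
inline std::string which(const void*)        { return "void*"; }

// Assumed fix: const char* is a better match for a string literal than
// const void*, so this overload wins and forwards to the string version.
inline std::string which(const char* s)      { return which(std::string(s)); }
}
```

With these definitions, before::which("foo") selects the const void* overload, while after::which("foo") ends up in the string one. Real void pointers still go to the const void* overload in both variants.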

As usual, the code for this blog post is available on GitHub.

If you enjoyed this post, you can subscribe to my blog, or follow me on Twitter.

Why we should see an uptake in <algorithm> usage


Now that C++11 is out, I think we should see an uptake in the use of good old <algorithm>. Why?

A common thing to do in a program is to iterate over a container of objects, producing another container of other objects. Imagine for instance you have a vector of domain objects:

struct DomainObject
{
    string label;
};
vector<DomainObject> objects;

Now you want to produce a vector containing the labels of all your domain objects. This is the “classical” solution:

    vector<string> labels(objects.size());
    for (size_t i = 0; i < objects.size(); ++i)
        labels[i] = objects[i].label;


You can instead use std::transform, which is more declarative, immune to off-by-one errors, and possibly friendlier to the optimizer. This is how it looks:

    vector<string> labels(objects.size());
    transform(objects.begin(), objects.end(), labels.begin(), label_for);


The problem, however, is that you need a function or function object to pass as the last argument to transform. Here is the one I used:

string label_for(const DomainObject& obj)
{
    return obj.label;
}


This reduces locality, and makes the code harder to read. Unless the helper is complex enough that you would want to reuse it a lot or test it on its own, it would be better to write it directly in the transform call. This is exactly what C++11 lambdas are good for, and where I think we’ll see them used a lot:

    vector<string> labels(objects.size());
    transform(objects.begin(), objects.end(), labels.begin(), [](const DomainObject& o){return o.label;});
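For reference, here is the same call as a self-contained sketch with the includes and std:: qualifications spelled out. The wrapper function labelsOf is a hypothetical name of mine, added only so the snippet can be compiled and called on its own.

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct DomainObject
{
    std::string label;
};

// Hypothetical wrapper around the transform call from the text.
std::vector<std::string> labelsOf(const std::vector<DomainObject>& objects)
{
    // The destination must already have room for every result,
    // because transform writes through labels.begin().
    std::vector<std::string> labels(objects.size());
    std::transform(objects.begin(), objects.end(), labels.begin(),
                   [](const DomainObject& o) { return o.label; });
    return labels;
}
```

Calling labelsOf on a vector holding objects labelled "one", "two" and "three" should give back those three strings in the same order.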


This isn’t a complete introduction to lambdas, but if you haven’t seen them before, here is a quick one. Lambda is just a fancy name for a function without a name. That means you can write one directly where you’d normally pass the name of a function. [] means “anonymous function follows” (at least for the purposes of this article), and after it you just type out a normal parameter list and function body. Mine takes a reference to a DomainObject and returns its label, just like label_for() did.
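To make the comparison with label_for() concrete, here is a small sketch that stores the lambda in a variable and calls it like any other function. The function labelViaLambda is a hypothetical helper added for illustration only.

```cpp
#include <string>

struct DomainObject
{
    std::string label;
};

// The named helper from the text.
std::string label_for(const DomainObject& obj)
{
    return obj.label;
}

// Hypothetical demo function: does the same job with a lambda.
std::string labelViaLambda(const DomainObject& obj)
{
    // [] introduces the lambda, (...) is its parameter list, {...} its body.
    // The return type is deduced from the single return statement.
    auto labelOf = [](const DomainObject& o) { return o.label; };

    // A lambda stored in a variable is called like an ordinary function.
    return labelOf(obj);
}
```

Both functions should return the same string for any DomainObject; the only difference is where the logic lives.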

Here is another example, using std::find_if to look for a specific element in a container:

    auto matched = find_if(objects.begin(), objects.end(), [](const DomainObject& o) { return o.label == "two"; });
    if (matched != objects.end()) // find_if returns end() when nothing matches
        cout << matched->label << endl;


Notice the use of auto, another C++11 feature. It uses type inference to deduce the type of the variable from its initializer. Here the compiler works out that find_if() returns a vector<DomainObject>::iterator, so there is no need for you to type that out.
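As a sketch of what auto deduces here, the snippet below checks the deduced type with static_assert, and also guards against the case where nothing matches, since find_if() then returns end(). The function labelOfTwo and its empty-string fallback are assumptions of mine, not part of the original code.

```cpp
#include <algorithm>
#include <string>
#include <type_traits>
#include <vector>

struct DomainObject
{
    std::string label;
};

// Hypothetical helper: returns the label of the first object labelled
// "two", or an empty string when there is no such object.
std::string labelOfTwo(std::vector<DomainObject>& objects)
{
    auto matched = std::find_if(objects.begin(), objects.end(),
                                [](const DomainObject& o) { return o.label == "two"; });

    // auto deduced the iterator type of the (non-const) vector for us.
    static_assert(std::is_same<decltype(matched),
                               std::vector<DomainObject>::iterator>::value,
                  "auto deduced vector<DomainObject>::iterator");

    // Dereferencing end() is undefined behaviour, so check first.
    if (matched == objects.end())
        return std::string();
    return matched->label;
}
```

Had objects been a const vector, the deduced type would have been the const_iterator instead, with no change needed at the call site.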

As usual, the code for this blog post is available on GitHub.
