Show Me Your Signature, and I’ll Tell You Who You Are


When you write your function signatures, you have a choice between passing values, pointers or references. You might be able to make any of them work for the compiler, but what do they tell the user?

Note that even though pointers and reference are somewhat related, and mostly communicate the same thing, they have different suggestions about ownership.

Parameters

1: Pass by value

void foo(Bar b); I need to copy your object, because I need to modify it, and you don’t want to see the change. (Except for built-in types, which are usually passed by value even though they are not modified by the function.)

2: Pass by reference/pointer

void foo(Bar& b); void foo(Bar* b); I need a reference to your object, in order to modify it, beacuse you need to see the change.

3: Pass by reference to const

void foo(const Bar& b); void foo(const Bar * b); I won’t need to modify your object, and I don’t want to pay the price of a copy.

You could argue that 1 just means I will leave your object alone, and doesn’t say anything about modification. But if you aren’t going to modify it, you should use 3, so I think 1 explicitly states that the object is going to be modified (invisibly to the caller). Unless of course the argument is a built-in type.

Return values

1: Return by value

Bar foo(); Here, take this object and do whatever you want with it, I won’t touch it again. It’s yours.

2: Return a reference

Bar& foo(); There is an object over here that you can use. Someone else might silently change it, though. By the way, I own it, and it is my responsibility to delete it. (If foo() is a non-static member function, you can usually assume that won’t happen until the object on which it is called goes out of scope. If foo() is a static function, you can usually assume this won’t happen until the program exits.)

3: Return a pointer

Bar* foo(); There is an object over here that you can use. Someone else might silently change it, though. By the way, I might delete it at any time. Or maybe that’s your responsibility, and if you don’t, you’ll have a memory leak. But I won’t tell you which one it is! You need more information to be sure, for instance the documentation. Often, you can also deduce ownership from the situation. A singleton retains ownership, a factory does not.

4: Return a reference/pointer to const

const Bar& foo(); const Bar* foo(); There is an object over here that you can use, and I promise it won’t change, even if I still own it. Rules of deletion are as in 2. The reason I don’t list the ownership issues for the pointer in this case, is that I think const is an indication that ownership is retained. If it was not, why use const at all?

Why Algorithms Need two Iterators


When using algorithms in C++, such as find(), sort(), etc., you might wonder why you need to specify both the beginning and the end of the container, and not just a reference to the container itself. See for instance:

vector<int> v; //Assume it also has some elements
sort(v.begin(), v.end());

Wouldn’t it be better if we could just write

sort(v);

and let sort call begin() and end()?

If this was the case, algorithms would only work on containers that define begin() and end(). Algorithms in C++ do however work on any type that has an iterator. And an iterator is not a type, it is pure abstraction. Anything that behaves like an iterator is an iterator. If a type has a concept of the element currently pointed to, pointing to the next element and equality, it can be an iterator.

One important example is pointers. C++ algorithms work with standard arrays, because pointers behave as iterators. It would be impossible to do this:

int a[42]; //
sort(a);

How would sort get a hold of iterators (pointers) to the beginning and end of the array? The beginning is easy, but since the length of the array is lost when passed to a function, the end is not in sight. For this to work, one would have to have special versions of all algorithms, taking the size of the array as an extra argument. This would lead to a loss of generality, and double the amount of algorithms.

Passing two iterators however preserves generality, and can be used with anything that behaves like an iterator.

Note that algorithms are not defined to take a specific iterator type, like this:

void sort(iterator begin, iterator end); //All iterators inherit from iterator

Instead, they are defined like this:

template <class It>
void sort(It begin, It end); //The type of It can be whatever

First, this lets us use pointers as iterators. Pointers of course do not inherit from an iterator type. Second, this relieves us of runtime polymorphism (which could negatively impact both speed and size). If you try use a type that does not provide the necessary iterator operators, the compiler will catch it when sort() tries to use them.

This is typical for templates; with compile time polymorphism, anything that provides the interface used by the templated function is acceptable, regardless of what (if anything) it inherits from. This is often referred to as duck typing: “when I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.” This is also what allows you to make a vector of any type. Note that this is different from for instance Java, which uses inheritance for collections instead of duck typing. In Java, all classes inherit from Object, and you can only have collections of objects, not primitive types.

Array Initialization Initializes all Elements


I was talking about C++ with a Java programmer the other day, as he had to work on a bit of C++ code. I discovered that it isn’t necessarily obvious that all the elements in an array are constructed when the array is initialized. That is however always the case.

For arrays of built-in types, the values aren’t actually set to a specific default (except for non-local and static arrays), and will end up having arbitrary values.

For arrays of a user-defined type, the default constructor is used for all the array elements. If you don’t want this behaviour, you need to use an array of pointers. All the elements will still be initialized, but the elements being initialized are now the pointers, not the actual objects. It is worth mentioning that pointers count as a built in type, and elements of non-local and static arrays will have undefined values. In particular, they are not initialized to point to NULL.

The reason why all the elements are initialized is that there is no such thing as a “null-object”. You can have a null-pointer not pointing to anything, but you cannot have an actual object that isn’t really an object, so to speak. This of course goes for Java as well, but here, an array of a user-defined type is actually an array of pointers, even though you don’t see any *s in the code. (Java uses pointer semantics for user-defined types and value semantics for built-in types.)

Here is a summary using actual C++ code:

int ints[10]; //Non-local array, all elements ==0
int func() {
    int ints[10]; //Local array, no default value is set, but all elements are fully usable ints.
    static int sints[10]; //Static local array, all elements are initialized to 0
    Foo foos[10]; //User-defined type, all elements are constructed using the default constructor Foo::Foo();
    Foo *foos[10]; //Local array, no default value is set. The pointers point all over the place.
}

If your class has no public default constructor, it is impossible to create an array of such objects:

class NoDefault {
private:
    NoDefault(); //Make default constructor private, and hence unusable
};
NoDefault nodefaults[10]; //Impossible!
NoDefault *pnodefaults[10]; //This is fine, no objects are constructed, only pointers.

Note also that you cannot have an array of references. The most obvious reason is that a reference must always refer to an object. There is no such thing as a “null-reference”. One could however imagine getting around this using an initializer list, like this:

    Foo f1, f2;
    Foo& foors[2] = {f1, f2};

Now we don’t try to initialize references that don’t refer to anything. This is still not allowed though. One reason is that you cannot point to a reference, so accessing elements in the array in a normal sense like foors[1] wouldn’t make sense. In C++, you can only have arrays of objects, and references are not objects. (Pointers are objects though.)

In summary:

  • Elements of arrays of a user-defined type are initialized using the default constructor.
  • Elements of local non-static arrays of a built-in type are not explicitly initialized, and will have arbitrary values.
  • Elements of non-local and static arrays of a built-in type are initialized to their default value (0).
  • You cannot have an array of a user-defined type without a public default constructor.
  • You can only have arrays of objects and pointers, not references.