Raw pointers in modern C++

Saturday, October 1, 2016

- Hi. May I have 3 owning raw pointers and a couple of delete keywords, please?
- No, sir. Not anymore. Not in 2016.

The rule

There is almost no need to use owning raw pointers and the delete keyword in today’s C++. It is highly unlikely that your situation is exceptional, unless you’re working on a brand new, super cool memory manager library or another very specific low-level tool.

Owning vs non-owning

The word ‘owning’ is important. All problems described below are related to owning pointers. There is nothing wrong with non-owning raw pointers.

void foo(const SomeClass* const p1) {
    SomeClass* p2 = new SomeClass();
    // p1 - non-owning, p2 - owning
    // ...
}
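To make the distinction concrete, here is a minimal sketch, assuming C++14 for std::make_unique. SomeClass is a hypothetical stand-in with a single field, and read_value() and owner_and_observer() are names invented for this illustration. The std::unique_ptr owns the object, while the function parameter is a plain non-owning pointer that only observes it.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for SomeClass; the field is made up for the example.
struct SomeClass {
    int value = 7;
};

// Non-owning parameter: the function reads through p and never deletes it.
int read_value(const SomeClass* const p) {
    return p != nullptr ? p->value : -1;
}

int owner_and_observer() {
    // The unique_ptr is the single owner; it frees the object when it
    // goes out of scope, on every exit path.
    const std::unique_ptr<SomeClass> owner = std::make_unique<SomeClass>();
    // get() hands out a non-owning raw pointer for the duration of the call.
    return read_value(owner.get());
}
```

The raw pointer here is harmless because it never outlives the owner and nobody is expected to call delete on it.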

What’s wrong with raw pointers

Raw pointers were an amazing invention in their time. They simplified development and pushed our industry forward. However, that time has passed, and today they create a lot of potential problems. Some of them are shown below. Let’s take a look at this simple snippet. Do you see the mistake?

void use_resource(const int n) {
    auto buff = new int[n * 1024];
    // do some operation with the buff
    delete buff;
}

First, we have a well-known memory leak problem. If an exception is thrown between allocation and deallocation, the memory owned by buff will not be released. And even if you never use exceptions and you’re sure that everyone who works with your code does the same, there is still a danger. For example, in time someone may add a validation check and return from the function before deallocation happens:

if (n > 1000) { // some validation
    return;
}

Another problem with the use_resource() snippet above is that I accidentally used delete instead of delete[] (we are people and people make mistakes sometimes, right?). And, guess what? Memory leak? It’s actually worse - we get undefined behaviour.

Now let’s try to be more defensive and wrap the allocation and deallocation of buff into a class:

#include <cstdio>

class Resource {
public:
    Resource() {
        buff_ = new int[10];
        // fill the buffer with 0..9
        for (int i = 0; i < 10; ++i) {
            buff_[i] = i;
        }
    }

    void use() {
        // do some operation with the resource, print second element for example
        printf("%d\n", buff_[1]);
    }

    ~Resource() {
        delete[] buff_;
    }

private:
    int* buff_;
};

It’s better, right? Now, we can write:

void use_resource(const int n) {
    Resource buff;
    // do some operation with the buff
} // buff's destructor is called automatically here

The code became simpler and we don’t need to worry about memory leaks because the destructor (which will be called automatically when buff goes out of scope) will take care of everything. Unfortunately, this code still has problems. Imagine the following use case:

void add_resource(std::vector<Resource>& resources, bool save) {
    Resource new_resource;
    // do something with new resource
    new_resource.use();
    if (save) { // save it for future usage
        resources.push_back(new_resource);
    }
}

int main() {
    std::vector<Resource> resources;
    // create 2 resources and save both of them to the vector
    add_resource(resources, true);
    add_resource(resources, true);
    for (auto& r : resources) {
        r.use();
    }
}

Here, when we push a resource to the vector, a copy of our object is created. And, oops, I forgot to write a copy constructor, so the compiler generated one for me. The generated constructor performs a so-called shallow copy, which means it copies only the pointer and not the data. We now have two objects that point to the same data. As soon as one of them goes out of scope (at the end of the add_resource() function), the data is deleted. Our resources vector ends up storing dangling pointers, so the line r.use() leads to undefined behavior, and so does the second delete[] on the same buffer when the vector’s elements are destroyed. If you’re lucky your program will crash, if not… well, anything can happen.
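The shallow copy can be observed directly. The sketch below uses a hypothetical Shallow struct rather than Resource itself: it has the same raw pointer member and no user-defined copy constructor, but deliberately no destructor, so the demo can free the buffer exactly once by hand and stay well-defined. The function name copy_shares_buffer() is also made up for this example.

```cpp
#include <cassert>

// Hypothetical type for illustration: a raw pointer member and a
// compiler-generated copy constructor, as in Resource, but no destructor.
struct Shallow {
    int* data;
};

bool copy_shares_buffer() {
    Shallow a{new int[3]{1, 2, 3}};
    Shallow b = a;  // generated copy: copies the pointer, not the array

    const bool same_address = (a.data == b.data);
    b.data[0] = 42;                                // a write through b ...
    const bool write_visible = (a.data[0] == 42);  // ... shows up through a

    delete[] a.data;  // freed once; b.data now dangles and must not be used
    return same_address && write_visible;
}
```

With a destructor that calls delete[], as Resource has, both a and b would try to free the same buffer, which is exactly the situation in the vector example.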

Raw pointers have a lot of other disadvantages. See Scott Meyers’s book for details or do your own research.

Why do people still use them?

Legacy code is a separate topic. C++ is an old language and there are tons of projects written in it, so obviously they cannot be refactored quickly. Nevertheless, why are owning raw pointers still used so heavily in recent projects written in modern C++? The only answer I have is “bad habit”. There have been many talks, books, and blog posts (see below), but not everyone has unlearned the old ways and learned the better one.

So… What to use instead?

In short - use the RAII idiom

  • Allocate objects on the stack if possible
  • Use smart pointers
  • Only as a last resort - wrap your resource in its own class, where you acquire in the constructor and release in the destructor, but do not forget about the copy/move constructors and assignment operators, which may be generated for you
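The options above can be sketched roughly as follows. This is one possible rewrite of the earlier snippets, not the only one, and it assumes C++11 or later. Resource now keeps its data in a std::vector, so the generated copy operations are deep and no destructor is needed; use() returns the value instead of printing it, and use_resource() returns a bool, purely so the sketch is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Rule-of-zero variant of Resource: std::vector owns the buffer, so the
// compiler-generated copy, move, and destructor all do the right thing.
class Resource {
public:
    Resource() : buff_(10) {
        for (std::size_t i = 0; i < buff_.size(); ++i) {
            buff_[i] = static_cast<int>(i);
        }
    }

    // Returns the second element (assumption: returned rather than printed).
    int use() const { return buff_[1]; }

private:
    std::vector<int> buff_;
};

// Smart-pointer variant of use_resource(): unique_ptr<int[]> calls delete[]
// itself on every exit path, including the early return.
bool use_resource(const int n) {
    if (n <= 0) {
        return false;
    }
    std::unique_ptr<int[]> buff(new int[static_cast<std::size_t>(n) * 1024]());
    if (n > 1000) {  // some validation
        return false;  // buff is released here
    }
    buff[0] = n;
    return buff[0] == n;  // buff is released here too
}

void add_resource(std::vector<Resource>& resources, const bool save) {
    Resource new_resource;  // lives on the stack
    if (save) {
        resources.push_back(new_resource);  // deep copy of the inner vector
    }
}
```

Calling add_resource() twice with save set to true now leaves two independent Resource objects in the vector, and the early return in use_resource() no longer leaks. In many cases a plain std::vector<int> would replace the unique_ptr<int[]> as well.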

Still not convinced? Take a look at amazing text/videos from C++ experts:
