A safe approach to project setup

Sunday, January 8, 2017

A ship will sail the way you build it.

TLDR

Proper project setup may require more effort in the beginning, but it will save you months in the long term by reducing the likelihood of a mistake and by simplifying the learning curve for new developers.

Here are some techniques (in no particular order) that you should consider using in your development process and your project setup. This blog post covers the first five.

  • source code control
  • clear project structure
  • maximum warning level
  • unit tests
  • style consistency
  • continuous integration
  • static code analysis
  • dynamic code analysis
  • code reviews
  • benchmarking
  • integration tests

See the sample project on GitHub. It sets the warning level to maximum and treats all warnings as errors; it also uses code style analysis as well as unit testing and mocking frameworks. The project is written in C++ (and this blog post is also a bit C++-centric), but everything discussed here is really language agnostic. I used googletest and Google CppLint in my example, but other libraries could easily be used instead. If you’re from the C++ world, take a look at the great talk by Marshall Clow about project setup from CppCon 2014.

Background

I have seen many different project setups and the varying outcomes that resulted. This topic is extremely important, because proper setup will save development time, reduce the number of errors and inconsistencies, simplify the learning curve for new developers and make everybody’s life easier in general.

Not many books emphasize the importance of this topic. As a new developer, you can easily learn about it by looking at some high-quality projects on GitHub, but you need to know which projects to look at. Plus, if you’re just starting your career, you usually do not pay much attention to project setup approaches. However, proper project setup has a huge impact on product quality - just as a foundation impacts the quality of the house it supports.

Let’s go through items from the list above step by step.

Source code control

Not much to comment on here. Fortunately, nowadays everyone is using it. The important thing here is to be consistent with your workflow (are you using the gitflow workflow or something else? Do you have a separate branch for each feature? What versioning approach are you using? Do you create a tag for every release? Do you have a release branch? Or both?).

Clear project structure

By clear I mean that it should be not only intuitive and logical, but that it should also align with widely known practices. For example, a common practice is to have folders like src, include, build, test and have a Readme file in the root folder. So it would be unexpected to see the main Readme five directories down or half of the source files in the build folder.

Warnings

Every warning should be treated very seriously and should never be ignored. Compiler developers put tons of effort into preventing errors, and they do an amazing job of directing us to potentially dangerous pieces of code.

It’s always a good idea to enable the maximum warning level (-Wall and -Wextra on GCC and Clang, /W4 or /Wall on MSVC) and enable the setting “treat warnings as errors” (-Werror on GCC and Clang, /WX on MSVC) in new projects.

I once watched a colleague of mine spend hours trying to figure out the reason behind a random crash in a huge project. He didn’t pay attention to warnings, because the project had thousands of them. The answer was simple:

#include <cstdio>
#include <string>

void foo() {
    std::string s("test");
    printf("%s", s);  // Oops! "%s" expects a const char*, i.e. s.c_str().
}

This situation occurred long ago. Actually, these days many (but not all) modern compilers would generate an error and not a warning on the printf line. Still, this example illustrates the situation well. If the project had been better set up, it would have taken two seconds to fix, or it would likely not have happened at all.

It’s harder with legacy code, which may have hundreds of warnings, but I have no doubt that it’s worthwhile to invest the time and clean up all those warnings slowly, step by step.
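For comparison, here is a minimal sketch of the corrected call. The helper name format_name and the 64-byte buffer are assumptions made for this illustration, not code from the project in the story. Format-string checks such as -Wformat, which -Wall enables on GCC and Clang, exist to catch mismatches like the one above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper for illustration: formats "name=<value>".
std::string format_name(const std::string& name) {
    char buffer[64];
    // c_str() yields the const char* that "%s" expects;
    // snprintf never writes past the end of the buffer.
    std::snprintf(buffer, sizeof(buffer), "name=%s", name.c_str());
    return std::string(buffer);
}

// Quick self-check of the helper.
void check_format_name() {
    assert(format_name("test") == "name=test");
}
```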

Unit tests

This item is the most important one. If you want to integrate only one item from the list above, integrate this one. Unit tests have been popular for a long time. There are hundreds of books about this topic. All the languages I have worked with (procedural, OO, functional and even very specialized ones like LabVIEW) have free libraries for unit testing. Many languages even come with unit test libraries out of the box; but surprisingly, there are tons of serious projects out there (even new ones, not only legacy projects), which don’t use tests.

Not only do unit tests help you to verify your code through different execution paths and edge cases, prevent regression errors, and make new developers more confident in making changes, but they also force you to design your classes/interfaces/functions better and decouple modules from each other.
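As a rough, framework-free sketch of what “different execution paths and edge cases” can look like, consider the test below. The function parse_port is a hypothetical example invented for this post; a real project would more likely use googletest or a similar library instead of plain assert.

```cpp
#include <cassert>
#include <string>

// Hypothetical function under test: parses a TCP port in the range 1..65535.
// Returns false (and leaves `port` untouched) on invalid input.
bool parse_port(const std::string& text, int& port) {
    if (text.empty() || text.size() > 5) return false;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    if (value < 1 || value > 65535) return false;
    port = value;
    return true;
}

// One assertion per execution path or edge case.
void test_parse_port() {
    int port = 0;
    assert(parse_port("8080", port) && port == 8080);    // typical input
    assert(parse_port("1", port) && port == 1);          // lower bound
    assert(parse_port("65535", port) && port == 65535);  // upper bound
    assert(!parse_port("0", port));                      // below the range
    assert(!parse_port("65536", port));                  // above the range
    assert(!parse_port("", port));                       // empty input
    assert(!parse_port("80a", port));                    // non-digit character
    assert(!parse_port("-1", port));                     // sign is rejected
}
```

Note how writing the test first pushes the function toward a small, explicit interface with a clear failure signal.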

Another pattern I see very often is writing unit tests without mock objects; many developers don’t know about them. Having tests without mocks is definitely better than having no tests at all. However, when you don’t use mocks, you are most likely using the unit test library to create integration tests. As a result, you don’t test the failure path, you don’t simulate exceptions that may be thrown by the dependencies of the object under test, and you don’t decouple modules from each other well enough. Your tests are also likely to take a long time to run. Mock frameworks are very powerful (and very often free) tools, so there is no reason not to use them. This example of a unit test illustrates how valuable mocks can be (MockObject is injected into the object under test as its dependency).

That said, don’t go to the other extreme - mocks returning mocks returning mocks is a sign of bad design. This post from Gary Bernhardt explains very well what ‘too many mocks’ means.
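To make the idea concrete without depending on any framework, here is a hand-written mock; it is a simplified stand-in for what a library such as Google Mock generates for you. The names Storage, Greeter and MockStorage are hypothetical and exist only for this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical dependency: in production this might talk to a database.
class Storage {
public:
    virtual ~Storage() = default;
    virtual std::string load(const std::string& key) = 0;
};

// Class under test; the dependency is injected through the constructor.
class Greeter {
public:
    explicit Greeter(Storage& storage) : storage_(storage) {}

    // Falls back to a default greeting when the storage fails.
    std::string greet(const std::string& user_id) {
        try {
            return "Hello, " + storage_.load(user_id) + "!";
        } catch (const std::runtime_error&) {
            return "Hello, guest!";
        }
    }

private:
    Storage& storage_;
};

// Hand-written mock: returns a canned value or throws, and records calls.
class MockStorage : public Storage {
public:
    std::string load(const std::string& key) override {
        ++calls;
        last_key = key;
        if (should_throw) throw std::runtime_error("storage unavailable");
        return canned_value;
    }

    bool should_throw = false;
    std::string canned_value;
    std::string last_key;
    int calls = 0;
};

void test_greeter() {
    MockStorage storage;
    storage.canned_value = "Alice";
    Greeter greeter(storage);

    assert(greeter.greet("42") == "Hello, Alice!");  // success path
    assert(storage.calls == 1 && storage.last_key == "42");

    storage.should_throw = true;  // simulate a failing dependency
    assert(greeter.greet("42") == "Hello, guest!");  // failure path
    assert(storage.calls == 2);
}
```

Because no database or network is involved, both the success path and the failure path run in microseconds.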

Unit test execution should be part of the build.

In my opinion, it’s important to run your tests as part of the build (and to keep their execution time short enough). If total test execution takes hours, then nobody will bother running the tests before pushing new changes.

When tests take a long time (let’s exclude situations where your project has hundreds of millions of lines of code and millions of tests), it is almost always because they create and then remove databases, or send network requests. That means you’re writing integration tests. It may be useful to have integration tests, but they should be an addition to the unit tests, not a replacement. Nevertheless, integration tests are not in the scope of this post.

Unlike in the past, everyone uses source code control these days. I have no doubt that eventually unit tests will be where source code control is now - everywhere. Start using them today (if you aren’t already) and your product quality will improve dramatically, allowing you to pull ahead of your competitors.

Style consistency

Do you like seeing spaces and tabs in the same file? Do you enjoy having CamelCase and underscore_style classes/methods/variables in the same project? What about methods that are two thousand lines long, half of which are commented out? Or ten empty lines between two lines of code?

I know people who don’t care about that, but I don’t really know anyone who enjoys it. Most people want the code to be consistent. Consistency makes code easier to read and maintain. If you’re using well-known code style practices (which you should), it will make the life of new developers much easier. Yet, it’s hard to be consistent even when you are the only one who writes the whole project. I believe that the only way to achieve style consistency is to use a tool that analyzes all your files. It is important to run this tool as a part of your build and then treat the output of the tool as compilation errors. All the languages that I know have such tools (often there are many of them for a single language): StyleCop for C#, Vera++ or Google CppLint for C++, Go comes with a style checker out of the box, etc.

Don’t worry if you have to customize or disable some rules. Every team/project has different preferences.

In my example project I used Google CppLint and I enabled usage of C++11 features, streams and references. It is also worth mentioning that if you’re from the C++ world, it’s a good idea to keep an eye on the C++ Core Guidelines and the automation tools that come with them.
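To show the principle only (a real project should use an established tool such as CppLint), here is a toy checker. The function check_style, the Violation struct, the three rules and the default limit of 80 columns are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// One finding of the toy checker: 1-based line number plus a description.
struct Violation {
    std::size_t line;
    std::string message;
};

// Scans source text line by line and reports three simple style problems.
// max_length is the kind of rule a team might customize.
std::vector<Violation> check_style(const std::string& source,
                                   std::size_t max_length = 80) {
    std::vector<Violation> violations;
    std::istringstream input(source);
    std::string line;
    std::size_t number = 0;
    while (std::getline(input, line)) {
        ++number;
        if (line.find('\t') != std::string::npos)
            violations.push_back({number, "tab character"});
        if (!line.empty() && (line.back() == ' ' || line.back() == '\t'))
            violations.push_back({number, "trailing whitespace"});
        if (line.size() > max_length)
            violations.push_back({number, "line too long"});
    }
    return violations;
}

void test_check_style() {
    assert(check_style("int a;\nint b;\n").empty());  // clean input
    std::vector<Violation> found = check_style("int a;\n\tint b; \n");
    assert(found.size() == 2);  // both problems are on line 2
    assert(found[0].line == 2 && found[0].message == "tab character");
    assert(found[1].line == 2 && found[1].message == "trailing whitespace");
}
```

A build step would run such a check over every source file and fail whenever the returned list is not empty, which is what treating style violations as compilation errors amounts to.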

Conclusion

Chaos creates chaos. If from the start your project has multiple warnings, no tests, tricky project structure and inconsistent style, all future commits will look the same. The probability of mistakes will be high, even if you hire top-notch developers to add a new feature.

Invest time in the project setup from the very beginning and always pay attention to the structure of great projects in order to learn from them.
