Building = Preprocessing + Compiling + Linking

After answering the same questions over and over and over and over again especially about compiling and linking, I thought it might be time to write down a few things, that I and maybe others could point to instead of keep repeating ourselves in the future. Given the wideness topic I’ll be writing things down in multiple parts. In this first post I’ll be explaining a few basic terms and their practical applications, the next one will go more into details on how linking against external libraries work and in the third and last part I’ll be writing down answers to some common issues. I hope this will turn out as a timesaver for all parties. – As a disclaimer I should add, that I’m a human being that makes mistakes and can have wrong models in their mind, thus if I make a mistake feel free to point them out.

Part I – Terms and Explanations

Although this should essentially be covered in any basic programming book, even the bad ones, here we go with a few “definitions”:

Application

Looking at it in a very abstract way, an application is no different than anything else on your hard disk, because it’s just ones and zeroes, the important part however is how these numbers get interpreted. Your CPU (the heart & brain of your PC) doesn’t understand the ones and zeroes from a music file, but if you give it the data of an application it can translate them into actions that will do what the programmer intended to (hopefully).

Building

To get such an application, your C++ code has to be turned into ones and zeroes, also known as machine code. The full processes of doing so is called building and is as such more an abstract term, because the process is essentially a composition of mainly three sub-processes: Preprocessing, compiling and linking.

Header File (.hpp / .h)

If you’ve done your job right, i.e. followed the correct way of writing code, then you should have all the declarations of functions, classes, structs, enumerates, templates, etc. in header files. With the help of a header file, the compiler will know how much stack and memory space is needed for the machine code that is about to be generated.

Source File (*.cpp)

The source file holds most of the implementation details, also known as code definitions. It’s where the behavior of the application itself is crated, meaning how the declared classes and functions interact with each other.

Inline File (*.inl)

This is not always needed and essentially optional, but it’s a common and good practice to put code definitions, that are required to be in the header file, into an external file and include it at the needed position in the header file. The most common use of inline files is for template definitions, which need to be in the header given that they get executed at compile time and can influence the memory layout and footprint.

Preprocessing

Before your code gets translated into anything the preprocessor will kick-in. All the statements that start with a number sign are actually preprocessor directives and get resolved in the first step of building an application, for example all the #include will get replaced with the actual content of the included header file or for any #define X Z all X will get replaced with Z or any kind of macro will get executed.

Compiling

Now that one source file will hold all required declarations the compilation can begin. Compiling is one of the terms that confuses a lot of people since it’s often used synonymously to building, which in turn isn’t too surprising given that compiling is the most crucial part, since it actually translates all the source into machine code. But you don’t end up with an application itself, but instead you end up with an object file for each source file.

Object File (.o / .obj)

Object files contain the translated source code in machine code. While the CPU essentially would understand the commands of the binary data, it would not know where to start exactly and miss other referenced code pieces. Thus these object files are only the building blocks used by the linker to generate the final executable.

Linking

As pointed out above the linker will be combining the object files generate by the compiler. It’s called linking because it obviously doesn’t just append or prepend the files, but actually makes “smart” links between parts of the object files – for details you might want to refer to a linker manual or other sources. But the linker doesn’t just link object files, it also links in libraries and it’s here where the difference between static and dynamic libraries are important, but I’ll go into more details in the next part, for now I’ll just say that static libraries are actually just archives of object files and thus can be linked in directly, while for dynamic libraries one just specify the interface and the “connection” between the application and the external code will be made at runtime.

Integrated Development Environments

This essentially doesn’t have anything to do with building applications, but it’s exactly that point that gets confused often. Integrated Development Environments short IDE are simply a set of tools created to work nicely (integrated) together and assist the programmer in creating code. A compiler and a linker can be considered part of this tool set, but it’s not the IDE that builds anything. Some of the most common IDEs would be Visual Studio, Code::Blocks, Qt Creator and Xcode. However one exception remains and that is, the Visual Studio compiler doesn’t really have a name of its own. Sometimes one refers to it as MSVC (Microsoft Visual C++), which in turn could also mean the IDE itself. What’s certain is, that one should not use “I built this game with Code::Blocks”, but instead “I built this game with MinGW” or “GCC”.

Final Thoughts

Having reached the end of the first part, I hope this basic information will be of some help to new beginners and depending on the response or future experience, I might add one or two paragraphs. The second part already has some words written for, so stay tuned!

2 thoughts on “Building = Preprocessing + Compiling + Linking”

Rosme says:

March 26, 2014 at 3:49 pm

Nice article. Something that is often being looked over in beginner tutorials.

1. eXpl0it3r says:
  
  March 26, 2014 at 4:10 pm
  
  Thanks! :)
  Most books usually deal one or the other way with these terms, so that should be basic knowledge, however the next two parts are hardly ever covered in C++ books. Linking dependencies gets often written off as something too OS specific or too distant from C++ (the language) itself. IMHO that’s quite bad, since linking is one of the most important things in C++ programming – don’t reinvent the wheel!