Build a Compiler — Backend

Marc Auberer
Apr 2, 2024


The job of the compiler backend is to transform the (optimized or unoptimized) IR into machine code for the target platform. This usually involves the following steps:

  • Target Code Generation:
    The backend takes the platform-agnostic IR and translates it into the machine code instructions specific to the target processor architecture. This step must account for the peculiarities of the hardware, such as the instruction set architecture (ISA), available registers, and the specifics of memory access patterns.
  • Instruction Selection:
    This involves choosing the specific machine instructions that will be used to implement the operations described in the IR. Different instructions might be more efficient in certain contexts, so the backend must choose the best options based on the target's hardware capabilities and constraints.
  • Register Allocation:
    The backend must manage the allocation of the target machine’s limited set of registers. This involves mapping the potentially unlimited virtual registers used in the IR to the physical registers of the target architecture. Efficient register allocation is crucial for performance, as it can minimize memory accesses, which are typically slower than register operations.
  • Instruction Scheduling:
    The order in which instructions are executed can significantly impact performance, especially on modern processors with pipelining and super-scalar execution. The backend reorders instructions to reduce pipeline stalls and improve instruction-level parallelism, taking into account dependencies between instructions and potential execution hazards.
  • Further Optimization:
    Although many optimizations are performed at the IR level in the compiler’s middle-end, the backend also performs target-specific optimizations. These can include peephole optimizations (small, local transformations to generate more efficient code sequences), loop unrolling (to reduce the overhead of loop control), and others designed to exploit specific hardware features.
  • Handling of Low-Level Details:
    This includes the generation of assembly directives, data layout for variables and arrays, alignment constraints, and the management of calling conventions for function calls. The backend must also handle the details of interfacing with the operating system, such as system call conventions and the setup of execution environments.

Compiler backends are highly specialized for their target architectures. This specialization allows them to take full advantage of the unique features and capabilities of the hardware, thereby generating the most efficient executable code possible from the given intermediate representation. In LLVM, the backend architecture is designed to be modular, allowing for the support of multiple target architectures while sharing as much of the frontend and optimization phases as possible. LLVM offers support for the most common CPU and GPU architectures out of the box.

The compiler backend produces an object file for each compilation unit. If the compilation units are related, there is a high chance that there are dependencies between those object files. To resolve these dependencies and obtain an executable program, these object files need to be linked together. This is where the linker comes into play.

Image source: https://media.geeksforgeeks.org/wp-content/uploads/20200808221828/llgfg.png [accessed 01 Apr, 2024]

The job of the linker is to combine the object files and ensure that all required symbols are present in the executable. Another aspect is the relocation of addresses: the linker assigns definite addresses, adjusting the code and data references to reflect where everything will reside in memory when executed. The linker can also perform another level of optimization, so-called Link-Time Optimization (LTO). As the linker sees the program as a whole, it is able to perform more sophisticated analysis along with cross-module optimizations like dead function elimination, cross-module inlining, or Interprocedural Optimizations (IPO), such as reordering instructions, merging similar functions, etc.

Exercise

Create the two source files ObjectEmitter.h and ObjectEmitter.cpp and implement the class ObjectEmitter, which inherits from CompilerPass. Implement two methods:

  • emit()
  • link()

This code can be used to emit an object file from an optimized LLVM module:

// Open an output stream for the object file
std::error_code errorCode;
llvm::raw_fd_ostream stream(objectFile.string(), errorCode, llvm::sys::fs::OF_None);
if (errorCode)
  throw std::runtime_error("File '" + objectFile.string() + "' could not be opened");

// Let the target machine populate the pass pipeline for object emission
llvm::legacy::PassManager passManager;
if (targetMachine->addPassesToEmitFile(passManager, stream, nullptr, llvm::CodeGenFileType::ObjectFile, false))
  throw std::runtime_error("Target machine can't emit a file of this type");

// Emit the object file
passManager.run(llvmModule);
stream.flush();

The link() method only contains boilerplate that indirectly invokes a linker.

Et voilà! Here we have our executable binary that we can ship to the target machine.

Thank you for reading this article series and doing the exercises! If you are interested in more, please visit my socials:



Marc Auberer

C++ Compiler Developer at SAP. Passionate software developer and OpenSource enthusiast.