Master's Thesis

Evaluating Value-Wise Poison Values for the LLVM Compiler

Filipe Parrado de Azevedo2020

Key information

Authors:

Filipe Parrado de Azevedo (Filipe Parrado de Azevedo)

Supervisors:

José Carlos Alves Pereira Monteiro (José Carlos Alves Pereira Monteiro); Nuno Claudino Pereira Lopes

Published in

28 de abril de 2020

Abstract

The Intermediate Representation (IR) of a compiler has become an important aspect of optimizing compilers in recent years. The IR of a compiler should make it easy to perform transformations while also giving portability to the compiler. One aspect of IR design is the role of Undefined Behavior (UB). UB is important to reflect the semantics of UB-heavy programming languages, like C and C++, namely allowing multiple desirable optimizations to be made and the modeling of unsafe low-level operations. Consequently, the IR of important compilers, such as LLVM or GCC, supports one or more forms of UB. In this work we focus on the LLVM compiler infrastructure and how it deals with UB in its IR, with the concepts of “poison” and “undef”, and how the existence of multiple forms of UB conflict with each other and cause problems to very important “textbook” optimizations, such as some forms of “Global Value Numbering” and “Loop Unswitching”, hoisting operations past control-flow, among others. To solve these problems we introduce a new semantics of UB to the LLVM, explaining how it can solve the different problems stated, while most optimizations currently in LLVM remain sound. Part of the implementation of the new semantics is the introduction of a new type of structure to the LLVM IR – Explicitly Packed Structure type – that represents each field in its own integer type with size equal to that of the field in the source code. Our implementation does not degrade the performance of the compiler.

Publication details

Authors in the community:

Supervisors of this institution:

Fields of Science and Technology (FOS)

electrical-engineering-electronic-engineering-information-engineering - Electrical engineering, electronic engineering, information engineering

Publication language (ISO code)

eng - English

Rights type:

Embargo lifted

Date available:

20 de maio de 2021

Institution name

Instituto Superior Técnico