Building LLVM Compiler-RT for ARM MCU

Jan 25, 2022 · Zhiyao Ma

What are compiler intrinsic functions?

Modern compilers perform extensive optimizations during code generation. They pick the instructions that run fastest or take the least space, while preserving the semantics of the source code written in a high-level language. Surprisingly, compilers may even substitute a function call for a sequence of instructions. They assume that a set of functions, called intrinsic functions or built-in functions, is always at their disposal.

When compilers deem it profitable, they insert calls to intrinsic functions, presuming that these functions have highly optimized and thoroughly tested implementations.

The code snippet below shows a concrete example.

/* Source file: temp.c */

void my_memset(void *ptr, char c, unsigned len) {
    char *cptr = (char *) ptr;
    for (; len != 0; --len)
        *cptr++ = c;
}

Let us compile it with optimization and disassemble the generated object file. One could choose any optimization level among -Og, -O1, -O2, -O3 and -Os. Below we show the result of compiling with clang -Os -nostdlib --target=armv7em-none-eabi -mcpu=cortex-m4 -c -o temp.o temp.c and disassembling with arm-none-eabi-objdump -d temp.o.

00000000 <my_memset>:
  0:  b142        cbz r2, 14 <my_memset+0x14> # r2 holds "len"
                                              # if "len" is 0, goto 14:
  2:  b580        push {r7, lr}               # preserve registers
  4:  466f        mov r7, sp                  # redundant instruction
  6:  460b        mov r3, r1                  # shuffle arguments to match the
  8:  4611        mov r1, r2                  # __aeabi_memset(dest, n, c) order
  a:  461a        mov r2, r3                  #
  c:  f7ff fffe   bl  0 <__aeabi_memset>      # *call intrinsic function*
 10:  e8bd 4080   ldmia.w sp!, {r7, lr}       # restore registers
 14:  4770        bx  lr                      # return

Clang delegates the heavy lifting to __aeabi_memset(), an intrinsic function it assumes to exist. Unfortunately, the assumption sometimes fails, especially when we cross-compile as above. To satisfy it, we must build the library containing the intrinsic functions and link against it ourselves.
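Loops are not the only construct lowered to runtime calls. As a small sketch (the file name and the function names div64 and scale are made up for illustration), the two functions below would typically also compile to helper calls for the target used above: the Cortex-M4 has no 64-bit divide instruction and its optional FPU handles single precision only, so the division is usually lowered to __aeabi_uldivmod() and the double multiplication to __aeabi_dmul().

```c
/* Source file: helpers.c (hypothetical example) */
#include <stdint.h>

/* 64-bit unsigned division: no single Cortex-M4 instruction does this,
 * so the compiler is expected to call __aeabi_uldivmod(). */
uint64_t div64(uint64_t a, uint64_t b) {
    return a / b;
}

/* Double-precision multiply: expected to become a call to __aeabi_dmul(). */
double scale(double x, double k) {
    return x * k;
}
```

Compiling and disassembling this file the same way as temp.c should reveal the bl instructions that target those helpers.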


Building LLVM compiler-rt

Clone the LLVM project to our local machine.

git clone https://github.com/llvm/llvm-project.git

Create and change to a build directory.

cd llvm-project
mkdir build-compiler-rt
cd build-compiler-rt

Configure the build. Provide the variables according to your environment.

cmake ../compiler-rt \
    -DCMAKE_INSTALL_PREFIX=${DIR_TO_INSTALL} \
    -DCMAKE_TRY_COMPILE_TARGET_TYPE=STATIC_LIBRARY \
    -DCOMPILER_RT_OS_DIR="baremetal" \
    -DCOMPILER_RT_BUILD_BUILTINS=ON \
    -DCOMPILER_RT_BUILD_SANITIZERS=OFF \
    -DCOMPILER_RT_BUILD_XRAY=OFF \
    -DCOMPILER_RT_BUILD_LIBFUZZER=OFF \
    -DCOMPILER_RT_BUILD_PROFILE=OFF \
    -DCMAKE_C_COMPILER=${LLVM_BIN_PATH}/clang \
    -DCMAKE_C_COMPILER_TARGET="arm-none-eabi" \
    -DCMAKE_ASM_COMPILER_TARGET="arm-none-eabi" \
    -DCMAKE_AR=${LLVM_BIN_PATH}/llvm-ar \
    -DCMAKE_NM=${LLVM_BIN_PATH}/llvm-nm \
    -DCMAKE_RANLIB=${LLVM_BIN_PATH}/llvm-ranlib \
    -DCOMPILER_RT_BAREMETAL_BUILD=ON \
    -DCOMPILER_RT_DEFAULT_TARGET_ONLY=ON \
    -DLLVM_CONFIG_PATH=${LLVM_BIN_PATH}/llvm-config \
    -DCMAKE_C_FLAGS="--target=arm-none-eabi -march=armv7em" \
    -DCMAKE_ASM_FLAGS="--target=arm-none-eabi -march=armv7em"

Compile and install the library.

make && make install

If everything goes well, we should now see the compiled library at

${DIR_TO_INSTALL}/lib/baremetal/libclang_rt.builtins-arm.a

Linking to it should resolve all missing definitions of __aeabi_*.


Now an undefined reference to memset()?

Quoting from the LLVM mailing list:

In a nutshell, Compiler-RT may assume there is a C library underneath. […] This also works on free-standing environments (ex. the Linux kernel) because those environments assume the compiler library will do so, and thus implement “memcpy”, “memset”, etc.

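So we just need to additionally provide definitions of memcpy(), memmove(), and memset(). The __aeabi_memclr() helper needs no C library counterpart of its own, since clearing memory is just memset() with a zero fill value.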
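To see where the reference to memset() comes from, here is a C model of the AEABI memory helpers. The prototypes follow the ARM run-time ABI; the model_ prefix and the C bodies are illustrative assumptions, since compiler-rt's real helpers are written in assembly and, to the best of our knowledge, simply rearrange the arguments and jump to the C library routine.

```c
#include <stddef.h>
#include <string.h>

/* Model of void __aeabi_memset(void *dest, size_t n, int c):
 * note that the length comes before the fill value. */
void model_aeabi_memset(void *dest, size_t n, int c) {
    memset(dest, c, n);
}

/* Model of void __aeabi_memclr(void *dest, size_t n): fill with zeros. */
void model_aeabi_memclr(void *dest, size_t n) {
    memset(dest, 0, n);
}

/* Model of void __aeabi_memcpy(void *dest, const void *src, size_t n). */
void model_aeabi_memcpy(void *dest, const void *src, size_t n) {
    memcpy(dest, src, n);
}
```

Each helper ends in a call to a plain C library function, which is why the linker starts asking for memset() and friends once the __aeabi_* symbols are resolved.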

A prudent reader might now worry that if we write the definition of memset() in C, the compiler may transform our code into a call to __aeabi_memset(), which in turn calls memset(), resulting in infinite recursion. But clang is smart enough to detect that we are providing the definition of memset(), so it refrains from generating the intrinsic function call. Brilliant!
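A minimal sketch of such definitions follows. These are plain byte-by-byte loops chosen for clarity, not speed; they are our own illustration rather than code taken from compiler-rt or any C library, and the file name is hypothetical. If you would rather not rely on clang recognizing the situation, compiling this file with -fno-builtin is a common extra precaution.

```c
/* Source file: mem.c (hypothetical name) */
#include <stddef.h>
#include <stdint.h>

/* Fill n bytes at dest with the low byte of c. */
void *memset(void *dest, int c, size_t n) {
    unsigned char *d = dest;
    while (n--)
        *d++ = (unsigned char) c;
    return dest;
}

/* Copy n bytes from src to dest; the regions must not overlap. */
void *memcpy(void *dest, const void *src, size_t n) {
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dest;
}

/* Copy n bytes from src to dest; the regions may overlap. */
void *memmove(void *dest, const void *src, size_t n) {
    unsigned char *d = dest;
    const unsigned char *s = src;
    if ((uintptr_t) d < (uintptr_t) s) {
        /* dest is below src: copying forward never clobbers unread bytes. */
        while (n--)
            *d++ = *s++;
    } else {
        /* dest is above src: copy backward, starting from the last byte. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dest;
}
```

A production system would likely replace these loops with word-sized or hand-tuned assembly versions, but the simple form is enough to satisfy the linker.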


What about GCC?

Up until now we have been talking about LLVM/Clang. Things in GCC are quite similar. GCC generates calls to intrinsic functions defined in libgcc. It also assumes the existence of libc. More information here.