[docs] Rewrite HowToCrossCompileLLVM #129451

Merged
merged 14 commits into from
Mar 11, 2025
331 changes: 177 additions & 154 deletions llvm/docs/HowToCrossCompileLLVM.rst
@@ -1,215 +1,238 @@
===================================================================
How To Cross-Compile Clang/LLVM using Clang/LLVM
How to cross-compile Clang/LLVM using Clang/LLVM
===================================================================

Introduction
============
------------

This document contains information about building LLVM and
Clang on host machine, targeting another platform.
Clang on a host machine, targeting another platform.

For more information on how to use Clang as a cross-compiler,
please check https://clang.llvm.org/docs/CrossCompilation.html.

TODO: Add MIPS and other platforms to this document.
This document describes cross-building a compiler in a single stage, using an
existing ``clang`` install as the host compiler.

Cross-Compiling from x86_64 to ARM
==================================
.. note::
These instructions have been tested for targeting 32-bit ARM, AArch64, or
64-bit RISC-V from an x86_64 Linux host, but should be equally applicable to
any other target.

In this use case, we'll be using CMake and Ninja, on a Debian-based Linux
system, cross-compiling from an x86_64 host (most Intel and AMD chips
nowadays) to a hard-float ARM target (most ARM targets nowadays).

The packages you'll need are:

* ``cmake``
* ``ninja-build`` (from backports in Ubuntu)
* ``gcc-4.7-arm-linux-gnueabihf``
* ``gcc-4.7-multilib-arm-linux-gnueabihf``
* ``binutils-arm-linux-gnueabihf``
* ``libgcc1-armhf-cross``
* ``libsfgcc1-armhf-cross``
* ``libstdc++6-armhf-cross``
* ``libstdc++6-4.7-dev-armhf-cross``

Configuring CMake
-----------------

For more information on how to configure CMake for LLVM/Clang,
see :doc:`CMake`.

The CMake options you need to add are:

* ``-DCMAKE_SYSTEM_NAME=<target-system>``
* ``-DCMAKE_INSTALL_PREFIX=<install-dir>``
* ``-DLLVM_HOST_TRIPLE=arm-linux-gnueabihf``
* ``-DLLVM_TARGETS_TO_BUILD=ARM``

Note: ``CMAKE_CROSSCOMPILING`` is always set automatically when ``CMAKE_SYSTEM_NAME`` is set. Don't put ``-DCMAKE_CROSSCOMPILING=TRUE`` in your options.

Also note that ``LLVM_HOST_TRIPLE`` specifies the triple of the system
that the cross built LLVM is going to run on - the flag is named based
on the autoconf build/host/target nomenclature. (This flag implicitly sets
other defaults, such as ``LLVM_DEFAULT_TARGET_TRIPLE``.)
Setting up a sysroot
--------------------

If you're compiling with GCC, you can use architecture options for your target,
and the compiler driver will detect everything that it needs:
You will need a sysroot that contains essential build dependencies compiled
for the target architecture. In this case, we will be using CMake and Ninja on
a Linux host and compiling against a Debian sysroot. Detailed instructions on
producing sysroots are outside of the scope of this documentation, but the
following instructions should work on any Linux distribution with these
prerequisites:

* ``-DCMAKE_CXX_FLAGS='-march=armv7-a -mcpu=cortex-a9 -mfloat-abi=hard'``
* ``binfmt_misc`` configured to execute ``qemu-user`` for binaries of the
target architecture. This is done by installing the ``qemu-user-static``
and ``binfmt-support`` packages on Debian-derived distributions.
* Root access (setups involving ``proot`` or other tools to avoid this
requirement may be possible, but aren't described here).
* The ``debootstrap`` tool. This is available in most distributions.

However, if you're using Clang, the driver might not be up-to-date with your
specific Linux distribution, version or GCC layout, so you'll need to fudge.
The following snippet will initialise sysroots for 32-bit Arm, AArch64, and
64-bit RISC-V (just pick the target(s) you are interested in):

In addition to the ones above, you'll also need:
.. code-block:: bash

* ``--target=arm-linux-gnueabihf`` or whatever is the triple of your cross GCC.
* ``'--sysroot=/usr/arm-linux-gnueabihf'``, ``'--sysroot=/opt/gcc/arm-linux-gnueabihf'``
or whatever is the location of your GCC's sysroot (where /lib, /bin etc are).
* Appropriate use of ``-I`` and ``-L``, depending on how the cross GCC is installed,
and where are the libraries and headers.
sudo debootstrap --arch=armhf --variant=minbase --include=build-essential,symlinks stable sysroot-deb-armhf-stable
sudo debootstrap --arch=arm64 --variant=minbase --include=build-essential,symlinks stable sysroot-deb-arm64-stable
sudo debootstrap --arch=riscv64 --variant=minbase --include=build-essential,symlinks unstable sysroot-deb-riscv64-unstable

You may also want to set the ``LLVM_NATIVE_TOOL_DIR`` option - pointing
at a directory with prebuilt LLVM tools (``llvm-tblgen``, ``clang-tblgen``
etc) for the build host, allowing you to them reuse them if available.
E.g. ``-DLLVM_NATIVE_TOOL_DIR=<path-to-native-llvm-build>/bin``.
If the option isn't set (or the directory doesn't contain all needed tools),
the LLVM cross build will automatically launch a nested build to build the
tools that are required.
The created sysroot may contain absolute symlinks, which will resolve to a
location within the host when accessed during compilation, so we must convert
any absolute symlinks to relative ones:

The CXX flags define the target, cpu (which in this case
defaults to ``fpu=VFP3`` with NEON), and forcing the hard-float ABI. If you're
using Clang as a cross-compiler, you will *also* have to set ``--sysroot``
to make sure it picks the correct linker.
.. code-block:: bash

When using Clang, it's important that you choose the triple to be *identical*
to the GCC triple and the sysroot. This will make it easier for Clang to
find the correct tools and include headers. But that won't mean all headers and
libraries will be found. You'll still need to use ``-I`` and ``-L`` to locate
those extra ones, depending on your distribution.
sudo chroot sysroot-of-your-choice symlinks -cr .
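
To illustrate which links are affected, here is a minimal sketch that builds a throwaway directory (a hypothetical stand-in for a real sysroot) and lists the symlinks whose targets are absolute. It assumes GNU ``find``, whose ``-lname`` test matches against a link's target; the library names are made up for the demonstration.

```shell
# Throwaway directory standing in for a sysroot (hypothetical layout).
demo=$(mktemp -d)
mkdir -p "$demo/usr/lib"

# An absolute target would be resolved against the host's filesystem.
ln -s /usr/lib/libfoo.so.1 "$demo/usr/lib/libfoo.so"
# A relative target stays inside the sysroot wherever it is mounted.
ln -s libbar.so.1 "$demo/usr/lib/libbar.so"

# List symlinks whose target starts with '/', i.e. the absolute ones.
absolute_links=$(find "$demo" -type l -lname '/*')
echo "$absolute_links"
```

Running the same ``find`` expression against a real sysroot before and after the ``symlinks -cr`` step is one way to check that the conversion took effect.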

Most of the time, what you want is to have a native compiler to the
platform itself, but not others. So there's rarely a point in compiling
all back-ends. For that reason, you should also set the
``TARGETS_TO_BUILD`` to only build the back-end you're targeting to.

You must set the ``CMAKE_INSTALL_PREFIX``, otherwise a ``ninja install``
will copy ARM binaries to your root filesystem, which is not what you
want.
Configuring CMake and building
------------------------------

Hacks
-----
For more information on how to configure CMake for LLVM/Clang,
see :doc:`CMake`. Following CMake's recommended practice, we will create a
`toolchain file
<https://cmake.org/cmake/help/book/mastering-cmake/chapter/Cross%20Compiling%20With%20CMake.html#toolchain-files>`_.

There are some bugs in current LLVM, which require some fiddling before
running CMake:
The following assumes you have a system install of ``clang`` and ``lld`` that
will be used for cross compiling and that the listed commands are executed
from within the root of a checkout of the ``llvm-project`` git repository.

#. If you're using Clang as the cross-compiler, there is a problem in
the LLVM ARM back-end that is producing absolute relocations on
position-independent code (``R_ARM_THM_MOVW_ABS_NC``), so for now, you
should disable PIC:
First, set variables in your shell session that will be used throughout the
build instructions:

.. code-block:: bash

-DLLVM_ENABLE_PIC=False
SYSROOT=$HOME/sysroot-deb-arm64-stable
TARGET=aarch64-linux-gnu
CFLAGS=""

This is not a problem, since Clang/LLVM libraries are statically
linked anyway, it shouldn't affect much.
To customise details of the compilation target or choose a different
architecture altogether, change the ``SYSROOT``,
``TARGET``, and ``CFLAGS`` variables to something matching your target. For
example, for 64-bit RISC-V you might set
``SYSROOT=$HOME/sysroot-deb-riscv64-unstable``, ``TARGET=riscv64-linux-gnu``
and ``CFLAGS="-march=rva20u64"``. Refer to documentation such as your target's
compiler documentation or processor manual for guidance on which ``CFLAGS``
settings may be appropriate. The specified ``TARGET`` should match the triple
used within the sysroot (i.e. ``$SYSROOT/usr/lib/$TARGET`` should exist).
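
The triple check described above can be sketched as a small shell function. ``check_sysroot_triple`` is a hypothetical helper written for this illustration, not something provided by LLVM or CMake; it only tests that the expected library directory exists.

```shell
# Hypothetical helper: succeed only if <sysroot>/usr/lib/<triple> exists.
check_sysroot_triple() {
    sysroot=$1
    target=$2
    if [ -d "$sysroot/usr/lib/$target" ]; then
        echo "ok: $sysroot/usr/lib/$target exists"
    else
        echo "error: $sysroot/usr/lib/$target not found; check TARGET" >&2
        return 1
    fi
}
```

With the variables set as above, it would be invoked as ``check_sysroot_triple "$SYSROOT" "$TARGET"`` before generating the toolchain file.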

#. The ARM libraries won't be installed in your system.
But the CMake prepare step, which checks for
dependencies, will check the *host* libraries, not the *target*
ones. Below there's a list of some dependencies, but your project could
have more, or this document could be outdated. You'll see the errors
while linking as an indication of that.
Then execute the following snippet to create a toolchain file:

Debian based distros have a way to add ``multiarch``, which adds
a new architecture and allows you to install packages for those
systems. See https://wiki.debian.org/Multiarch/HOWTO for more info.
.. code-block:: bash

But not all distros will have that, and possibly not an easy way to
install them in any anyway, so you'll have to build/download
them separately.
cat - <<EOF > $TARGET-clang.cmake
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSROOT "$SYSROOT")
set(CMAKE_C_COMPILER_TARGET $TARGET)
set(CMAKE_CXX_COMPILER_TARGET $TARGET)
set(CMAKE_C_FLAGS_INIT "$CFLAGS")
set(CMAKE_CXX_FLAGS_INIT "$CFLAGS")
set(CMAKE_LINKER_TYPE LLD)
set(CMAKE_C_COMPILER clang)
set(CMAKE_CXX_COMPILER clang++)
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)
EOF


Then configure and build by invoking ``cmake``:

A quick way of getting the libraries is to download them from
a distribution repository, like Debian (http://packages.debian.org/jessie/),
and download the missing libraries. Note that the ``libXXX``
will have the shared objects (``.so``) and the ``libXXX-dev`` will
give you the headers and the static (``.a``) library. Just in
case, download both.
.. code-block:: bash

The ones you need for ARM are: ``libtinfo``, ``zlib1g``,
``libxml2`` and ``liblzma``. In the Debian repository you'll
find downloads for all architectures.
cmake -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS="lld;clang" \
Member

worth mentioning compiler-rt (and libc++)? it could be challenging to set up and readers might want to know how to do it.

Contributor Author

I think that would be a great follow-up improvement. My goal with this PR is to make something that covers the same content as the original documentation, but in a better way.

-DCMAKE_TOOLCHAIN_FILE=$(pwd)/$TARGET-clang.cmake \
-DLLVM_HOST_TRIPLE=$TARGET \
-DCMAKE_INSTALL_PREFIX=$HOME/clang-$TARGET \
-S llvm \
-B build/$TARGET
cmake --build build/$TARGET

These options from the toolchain file and ``cmake`` invocation above are
important:

* ``CMAKE_SYSTEM_NAME``: Perhaps surprisingly, explicitly setting this
variable `causes CMake to set
CMAKE_CROSSCOMPILING <https://cmake.org/cmake/help/latest/variable/CMAKE_CROSSCOMPILING.html#variable:CMAKE_CROSSCOMPILING>`_.
* ``CMAKE_{C,CXX}_COMPILER_TARGET``: This will be used to set the
``--target`` argument to ``clang``. The triple should match the triple used
within the sysroot (i.e. ``$SYSROOT/usr/lib/$TARGET`` should exist).
* ``CMAKE_FIND_ROOT_PATH_MODE_*``: These `control the search behaviour for
finding libraries, includes or binaries
<https://cmake.org/cmake/help/book/mastering-cmake/chapter/Cross%20Compiling%20With%20CMake.html#finding-external-libraries-programs-and-other-files>`_.
Setting these prevents files for the host being used in the build.
* ``LLVM_HOST_TRIPLE``: Specifies the target triple of the system the built
LLVM will run on, which also implicitly sets other defaults such as
``LLVM_DEFAULT_TARGET_TRIPLE``. For example, if you are using an x86_64
host to compile for RISC-V, this will be a RISC-V triple.
* ``CMAKE_SYSROOT``: The path to the sysroot containing libraries and headers
for the target.
* ``CMAKE_INSTALL_PREFIX``: Setting this avoids installing binaries compiled
for the target system into system directories for the host system. It is
not required unless you are going to use the ``install`` target.

See `LLVM's build documentation
<https://llvm.org/docs/CMake.html#frequently-used-cmake-variables>`_ for more
guidance on CMake variables (e.g. ``LLVM_TARGETS_TO_BUILD`` may be useful if
your cross-compiled binaries only need to support compiling for one target).

Working around a ninja dependency issue
---------------------------------------

If you followed the instructions above to create a sysroot, you may run into a
`longstanding problem related to path canonicalization in ninja
<https://github.com/ninja-build/ninja/issues/1330>`_. GCC canonicalizes system
headers in dependency files, so when ninja reads them it does not need to do
so. Clang does not do this, and unfortunately ninja does not implement the
canonicalization logic at all, meaning for some system headers with symlinks
in the paths, it can incorrectly compute a non-existing path and consider it
as always modified.

If you are suffering from this issue, you will find any attempt at an
incremental build (including the suggested command to build the ``install``
target in the next section) results in recompiling everything. ``ninja -C
build/$TARGET -t deps`` shows files in ``$SYSROOT/include/*`` that
do not exist (as the ``$SYSROOT/include`` folder does not exist) and you can
further confirm these files are causing ``ninja`` to determine a rebuild is
necessary with ``ninja -C build/$TARGET -d deps``.

A workaround is to create a symlink so that the incorrect
``$SYSROOT/include/*`` dependencies resolve to files within
``$SYSROOT/usr/include/*``. This works in practice for the simple
cross-compilation use case described here, but is not a general solution.

After you download and unpack all ``.deb`` packages, copy all
``.so`` and ``.a`` to a directory, make the appropriate
symbolic links (if necessary), and add the relevant ``-L``
and ``-I`` paths to ``-DCMAKE_CXX_FLAGS`` above.
.. code-block:: bash

sudo ln -s usr/include $SYSROOT/include
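
The effect of that relative symlink can be seen in a throwaway directory. This is only a sketch: ``demo`` is a hypothetical stand-in for ``$SYSROOT``, and ``stdio.h`` is an empty placeholder file.

```shell
# Throwaway directory standing in for $SYSROOT (hypothetical layout).
demo=$(mktemp -d)
mkdir -p "$demo/usr/include"
: > "$demo/usr/include/stdio.h"

# Before the workaround, the path under include/ does not exist.
[ -e "$demo/include/stdio.h" ] && before=present || before=missing

# Same form as the workaround: a relative link from include to usr/include.
ln -s usr/include "$demo/include"

# Afterwards the same path resolves to the file under usr/include.
[ -e "$demo/include/stdio.h" ] && after=present || after=missing
echo "before: $before, after: $after"
```

Because the link target is relative, it keeps working if the sysroot directory is moved.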

Running CMake and Building
--------------------------
Testing the just-built compiler
-------------------------------

Finally, if you're using your platform compiler, run:
Confirm the ``clang`` binary was built for the expected target architecture:

.. code-block:: bash

$ cmake -G Ninja <source-dir> -DCMAKE_BUILD_TYPE=<type> <options above>
$ file -L ./build/aarch64-linux-gnu/bin/clang
./build/aarch64-linux-gnu/bin/clang: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=516b8b366a790fcd3563bee4aec0cdfcb90bb1c7, not stripped

If you're using Clang as the cross-compiler, run:
If you have ``qemu-user`` installed you can test the produced target binary
either by invoking ``qemu-{target}-static`` directly:

.. code-block:: bash

$ CC='clang' CXX='clang++' cmake -G Ninja <source-dir> -DCMAKE_BUILD_TYPE=<type> <options above>

If you have ``clang``/``clang++`` on the path, it should just work, and special
Ninja files will be created in the build directory. I strongly suggest
you to run ``cmake`` on a separate build directory, *not* inside the
source tree.
$ qemu-aarch64-static -L $SYSROOT ./build/aarch64-linux-gnu/bin/clang --version
clang version 21.0.0git (https://github.com/llvm/llvm-project cedfdc6e889c5c614a953ed1f44bcb45a405f8da)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/asb/llvm-project/build/aarch64-linux-gnu/bin

To build, simply type:
Or, if binfmt_misc is configured (as was necessary for debootstrap):

.. code-block:: bash

$ ninja
$ export QEMU_LD_PREFIX=$SYSROOT; ./build/aarch64-linux-gnu/bin/clang --version
clang version 21.0.0git (https://github.com/llvm/llvm-project cedfdc6e889c5c614a953ed1f44bcb45a405f8da)
Target: aarch64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/asb/llvm-project/build/aarch64-linux-gnu/bin

It should automatically find out how many cores you have, what are
the rules that needs building and will build the whole thing.

You can't run ``ninja check-all`` on this tree because the created
binaries are targeted to ARM, not x86_64.

Installing and Using
Installing and using
--------------------

After the LLVM/Clang has built successfully, you should install it
via:

.. code-block:: bash

$ ninja install
.. note::
Use of the ``install`` target requires that you have set
``CMAKE_INSTALL_PREFIX``; otherwise it will attempt to install in
directories under ``/`` on your host.
Comment on lines +214 to +217
Member

Another variable that's relevant and worth mentioning here is DESTDIR. The directory the install target installs files to is $DESTDIR/$CMAKE_INSTALL_PREFIX, so even if CMAKE_INSTALL_PREFIX is set to / you can still change the destination directory to avoid overwriting files under / on your host.

This is commonly used by package managers, e.g. you might set CMAKE_INSTALL_PREFIX to /usr and DESTDIR to /path/to/stage.


which will create a sysroot on the install-dir. You can then tar
that directory into a binary with the full triple name (for easy
identification), like:
If you want to transfer a copy of the built compiler to another machine, you
can first install it to a location on the host via:

.. code-block:: bash

$ ln -sf <install-dir> arm-linux-gnueabihf-clang
$ tar zchf arm-linux-gnueabihf-clang.tar.gz arm-linux-gnueabihf-clang
cmake --build build/$TARGET --target=install

If you copy that tarball to your target board, you'll be able to use
it for running the test-suite, for example. Follow the guidelines at
https://llvm.org/docs/lnt/quickstart.html, unpack the tarball in the
test directory, and use options:
This will install the LLVM/Clang headers, binaries, libraries, and other files
to paths within ``CMAKE_INSTALL_PREFIX``. Then tar that directory for transfer
to a device that runs the target architecture natively:

.. code-block:: bash

$ ./sandbox/bin/python sandbox/bin/lnt runtest nt \
--sandbox sandbox \
--test-suite `pwd`/test-suite \
--cc `pwd`/arm-linux-gnueabihf-clang/bin/clang \
--cxx `pwd`/arm-linux-gnueabihf-clang/bin/clang++
tar -czvf clang-$TARGET.tar.gz -C $HOME clang-$TARGET
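
Before transferring the archive, it may be worth listing its contents to confirm the layout. The sketch below uses a temporary directory and an empty placeholder file in place of a real install tree; ``prefix_parent`` is a hypothetical stand-in for ``$HOME``.

```shell
# Stand-in install tree; in the real workflow this is $HOME/clang-$TARGET.
TARGET=aarch64-linux-gnu
prefix_parent=$(mktemp -d)
mkdir -p "$prefix_parent/clang-$TARGET/bin"
: > "$prefix_parent/clang-$TARGET/bin/clang"

# Same tar invocation shape as above, without -v to keep the output quiet.
archive="$prefix_parent/clang-$TARGET.tar.gz"
tar -czf "$archive" -C "$prefix_parent" "clang-$TARGET"

# List the archive: every entry should sit under clang-$TARGET/.
listing=$(tar -tzf "$archive")
echo "$listing"
```

Using ``-C`` keeps the stored paths relative to the install directory's parent, so unpacking on the target creates a single ``clang-$TARGET`` directory.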

Remember to add the ``-jN`` options to ``lnt`` to the number of CPUs
on your board. Also, the path to your clang has to be absolute, so
you'll need the `pwd` trick above.
The generated toolchain is portable, but requires compatible versions of any
shared libraries it links against. This means using a sysroot that is as
similar to your target operating system as possible is desirable. Other `CMake
variables <https://llvm.org/docs/CMake.html#frequently-used-cmake-variables>`_
may be helpful, for instance ``LLVM_STATIC_LINK_CXX_STDLIB``.