The Raspberry Pi is an amazing computer. With this little device you can build and run C code, play with Python, connect to external hardware, hack Linux, create a home server, power robots, and much more. When I first started with the Raspberry Pi, I didn’t know that I could write my source code from the comfort of my laptop, compile it there, and then transfer the executable to the Pi. On the one hand, it’s great that you can run GCC directly on a Raspberry Pi: if you have an idea, you can quickly prototype it without having to worry about setting up a build environment. On the other hand, if you’re working on anything that will take longer than an afternoon to complete, you’ll probably want to use a fast PC that runs your favorite text editor, lets you keep multiple web browser windows open, and, most importantly, compiles your code quickly. To do that, you can set up a cross-compiling container that outputs an executable for your Raspberry Pi.

If you just want to jump straight to the doing and not worry too much about the how, you can check out my GitHub repository for this project.

Stating our goal

Let’s say we have just installed the latest Raspberry Pi OS on a brand new Pi. At the time of writing, that would be the August 2020 version running on a Raspberry Pi 4 Model B. As soon as you boot your Pi for the first time, you’ll have GCC 8.3.0 installed on it. By default, this version of GCC compiles code for ARMv6. The goal here is to install the appropriate compiler in a container and generate the same executable that you would get if you compiled your code on the Raspberry Pi itself.

Building the container

I’ll use Docker to containerize our build environment. The Raspberry Pi Foundation recommends Ubuntu to cross compile a kernel for Raspberry Pi, so I figured that would be my starting point. However, after a lot of trial and error, I realized it would be easier if I used Debian; Raspberry Pi OS is much closer to Debian than it is to Ubuntu, and it seems to be easier to get the right version of the software I need to install on Debian, so that’s what I’ll show here.

At the bare minimum, we will need a GCC cross compiler for the Pi’s architecture and the matching binutils. gcc-arm-linux-gnueabihf is the GNU C cross-compiler for the armhf architecture; we can double check that this is the Raspberry Pi 4’s architecture by running dpkg --print-architecture on the Pi, which should print armhf. To match the compiler that comes with Raspberry Pi OS, I’ll install the version-specific package, gcc-8-arm-linux-gnueabihf. binutils-arm-linux-gnueabihf provides the assembler, linker, and other binary utilities for the same target.

Our Dockerfile will look something like this:

FROM debian:10-slim

RUN apt-get update \
    && apt-get -y install \
    make \
    libc6=2.28-10 \
    gcc-8-arm-linux-gnueabihf \
    binutils-arm-linux-gnueabihf

I specifically chose Debian 10 to match the version of Raspberry Pi OS that I’m running, which is based on Debian 10 (also known as Debian buster). The Pi is running GCC 8.3.0-6+rpi1 and Binutils 2.31.1 and comes with GLIBC 2.28-10+rpi1. The Debian container has GCC Debian 8.3.0-2 and Binutils 2.31.1-16 installed, so all in all, it seems pretty close to what the Pi has.
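To make “pretty close” a little more concrete, the version strings quoted above can be compared by their upstream part, that is, everything before the distribution’s revision suffix. The following is a minimal Python sketch; the helper name upstream_version is my own invention, and the strings are simply the ones listed in the previous paragraph.

```python
def upstream_version(debian_version: str) -> str:
    """Return the upstream part of a Debian-style version string.

    Debian versions look like [epoch:]upstream[-revision]; the revision
    is whatever follows the last hyphen.
    """
    # Drop an optional epoch such as "1:" at the start.
    without_epoch = debian_version.split(":", 1)[-1]
    # Drop the revision after the last hyphen, if there is one.
    upstream, _, _ = without_epoch.rpartition("-")
    return upstream or without_epoch


# Version strings as quoted in the text (Pi first, container second).
versions = {
    "gcc": ("8.3.0-6+rpi1", "8.3.0-2"),
    "binutils": ("2.31.1", "2.31.1-16"),
}

for package, (pi, container) in versions.items():
    same = upstream_version(pi) == upstream_version(container)
    print(f"{package}: Pi {pi}, container {container}, same upstream: {same}")
```

Both packages share the same upstream release; only the distribution revisions differ, and those revisions can still carry distribution-specific patches and defaults.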

Testing the cross compiler

To test that it works, I’ll compile an empty C program, which only contains the following line:

int main() {}

To make things easier, I’ll write a short Makefile to build all the artifacts:

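# Recipe lines must start with a tab character.
.PHONY: assembly object executable clean

# Compile main.c to assembly only (-S).
assembly:
	@mkdir -p build
	arm-linux-gnueabihf-gcc-8 \
	    -S -march=armv6 -marm -mfpu=vfp -masm-syntax-unified \
	    main.c \
	    -o build/main.s \
	    -O0

# Assemble the generated listing into an object file (-c).
object: assembly
	@mkdir -p build
	arm-linux-gnueabihf-gcc-8 \
	    -march=armv6 -marm -mfpu=vfp \
	    -c build/main.s \
	    -o build/main.o \
	    -O0

# Assemble and link the generated listing into an executable.
executable: assembly
	@mkdir -p build
	arm-linux-gnueabihf-gcc-8 \
	    -march=armv6 -marm -mfpu=vfp \
	    build/main.s \
	    -o build/main \
	    -O0

clean:
	rm -f build/*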

By default, the compiler that we have installed in our Debian container will compile software for ARMv7, but we want to build it for ARMv6 (just to match the default architecture of the GCC that comes with the Raspberry Pi), hence -march=armv6. Debian’s compiler also defaults to generating Thumb instructions, and the version of Thumb it expects (Thumb-2) is not supported by ARMv6. Therefore, we must also choose to generate code that executes in ARM state with -marm.

After examining the assembly code generated by the Raspberry Pi and the Debian compiler, I realized that they also target different Vector Floating Point (VFP) versions, so I added the -mfpu=vfp option to the Makefile to make Debian build code for the same VFP version as the Pi. If you want to see the defaults that GCC was configured with, you can type arm-linux-gnueabihf-gcc-8 -v to check them out.

Additionally, I chose to use Unified Assembly Language (UAL) for our assembly code with the -masm-syntax-unified flag. For this example, however, it shouldn’t make a difference.

Finally, since I intend to compare the assembly code generated by the Raspberry Pi’s compiler and the cross compiler, I added an option to disable all compiler optimizations with -O0. That might allow me to do the analysis without worrying about the compiler obfuscating the original intent of the source code.
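Once make assembly has run, the directives near the top of build/main.s offer a quick way to confirm that these flags took effect. Below is a minimal Python sketch of such a check. The function name target_directives and the sample text are illustrative assumptions; the directive names (.arch, .fpu, .arm, .syntax) are the ones GCC emits in its ARM assembly output.

```python
def target_directives(assembly: str) -> dict:
    """Collect the target-selection directives from a GCC assembly listing."""
    found = {}
    for line in assembly.splitlines():
        parts = line.split()
        if not parts:
            continue
        name = parts[0]
        if name in (".arch", ".fpu", ".syntax") and len(parts) > 1:
            # Later occurrences overwrite earlier ones.
            found[name] = parts[1]
        elif name in (".arm", ".thumb"):
            # Instruction-set state: ARM or Thumb.
            found["state"] = name.lstrip(".")
    return found


# Hypothetical excerpt; in practice, read the text from build/main.s.
sample = """\
    .arch armv6
    .syntax unified
    .arm
    .fpu vfp
"""

expected = {".arch": "armv6", ".syntax": "unified", "state": "arm", ".fpu": "vfp"}
print(target_directives(sample) == expected)
```

If the state came back as thumb or the architecture as something other than armv6, that would be a hint that one of the flags in the Makefile was dropped.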

Comparing the assembly code

The cross compiler generates this assembly code:

    .arch armv6
    .eabi_attribute 28, 1
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 6
    .eabi_attribute 34, 1
    .eabi_attribute 18, 4
    .file   "main.c"
    .text
    .align  2
    .global main
    .arch armv6
    .syntax unified
    .arm
    .fpu vfp
    .type   main, %function
main:
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 1, uses_anonymous_args = 0
    @ link register save eliminated.
    str fp, [sp, #-4]!
    add fp, sp, #0
    mov r3, #0
    mov r0, r3
    add sp, fp, #0
    @ sp needed
    ldr fp, [sp], #4
    bx  lr
    .size   main, .-main
    .ident  "GCC: (Debian 8.3.0-2) 8.3.0"
    .section    .note.GNU-stack,"",%progbits

The assembly generated on the Raspberry Pi differs only in the second-to-last line:

<   .ident  "GCC: (Debian 8.3.0-2) 8.3.0"
---
>   .ident  "GCC: (Raspbian 8.3.0-6+rpi1) 8.3.0"

The .ident directive will only place a tag on the object file, so in terms of functionality, the assembly generated by the cross compiler is the same as the one compiled by the Raspberry Pi.
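Since the .ident line is expected to differ, it can help to filter it out before comparing, so that any remaining difference stands out. This is a small Python sketch using the standard difflib module; the function name meaningful_diff and the two short sample listings are assumptions made for illustration.

```python
import difflib


def meaningful_diff(cross: str, native: str) -> list:
    """Diff two assembly listings, ignoring their .ident lines."""
    def keep(text):
        return [
            line for line in text.splitlines()
            if not line.strip().startswith(".ident")
        ]
    # unified_diff yields nothing when the filtered listings are identical.
    return list(difflib.unified_diff(keep(cross), keep(native), lineterm=""))


# Shortened, hypothetical stand-ins for the two build/main.s files.
cross_listing = """\
    mov r0, r3
    bx  lr
    .ident  "GCC: (Debian 8.3.0-2) 8.3.0"
"""
pi_listing = """\
    mov r0, r3
    bx  lr
    .ident  "GCC: (Raspbian 8.3.0-6+rpi1) 8.3.0"
"""

print(meaningful_diff(cross_listing, pi_listing))  # [] means no real difference
```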

Comparing the object files

Before doing any linking, it’s good to take a look at the main.o object file. Since the source assembly is (nearly) the same, so are the main.o files generated by the Pi and the cross compiler. A way to compare the object files is with objdump --disassemble --full-contents.

In our example, the only difference is in the .comment section, whose contents come straight from the .ident directive:

<  0000 00474343 3a202844 65626961 6e20382e  .GCC: (Debian 8.
<  0010 332e302d 32292038 2e332e30 00        3.0-2) 8.3.0.   
---
>  0000 00474343 3a202852 61737062 69616e20  .GCC: (Raspbian 
>  0010 382e332e 302d362b 72706931 2920382e  8.3.0-6+rpi1) 8.
>  0020 332e3000                             3.0. 
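To double check that these bytes really are the .ident strings, the hexadecimal columns of the dump can be decoded by hand. Here is a minimal Python sketch; the function name decode_comment is an assumption, and the hex strings are copied from the dump above.

```python
def decode_comment(hex_words: str) -> str:
    """Decode the hex columns of an objdump --full-contents dump as ASCII."""
    # Remove the whitespace between the groups of hex digits.
    raw = bytes.fromhex("".join(hex_words.split()))
    # The section content is NUL-delimited; drop the surrounding NUL bytes.
    return raw.strip(b"\x00").decode("ascii")


debian = "00474343 3a202844 65626961 6e20382e 332e302d 32292038 2e332e30 00"
raspbian = (
    "00474343 3a202852 61737062 69616e20 "
    "382e332e 302d362b 72706931 2920382e 332e3000"
)

print(decode_comment(debian))    # GCC: (Debian 8.3.0-2) 8.3.0
print(decode_comment(raspbian))  # GCC: (Raspbian 8.3.0-6+rpi1) 8.3.0
```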

Analyzing the executable files

Although our cross-compiled main.o file is the same as the one compiled by the Pi, the final executable is not. The Debian container produces an 8108-byte executable, whereas the Pi makes a 7912-byte binary. Stripping the files results in 5564-byte and 5540-byte executables, respectively, so it’s clear that there must be some differences beyond the symbols in the object files.
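The reasoning behind that conclusion is plain arithmetic on the sizes quoted above, sketched here in Python. The variable names are mine; the numbers come from the paragraph.

```python
# File sizes in bytes, as reported above.
sizes = {
    "container": {"unstripped": 8108, "stripped": 5564},
    "pi": {"unstripped": 7912, "stripped": 5540},
}

# How many bytes stripping removed from each executable.
removed = {name: s["unstripped"] - s["stripped"] for name, s in sizes.items()}

# Size gap between the two executables before and after stripping.
gap_before = sizes["container"]["unstripped"] - sizes["pi"]["unstripped"]
gap_after = sizes["container"]["stripped"] - sizes["pi"]["stripped"]

print(removed)                # {'container': 2544, 'pi': 2372}
print(gap_before, gap_after)  # 196 24
```

Stripping closes most of the gap, but 24 bytes remain, so at least part of the difference lives in the sections that strip leaves in place.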

After disassembling the stripped executable files, I can see that there are many differences between them. The .text section alone is full of differences (I’ve included them at the end of this blog post for any assembly nerds ;) ). However, one chunk that is the same in both files is the one that corresponds to the main function in the original assembly code:

push    {fp}    ; (str fp, [sp, #-4]!)
add fp, sp, #0
mov r3, #0
mov r0, r3
add sp, fp, #0
pop {fp}        ; (ldr fp, [sp], #4)
bx  lr

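During the linking stage, our main.o is stitched together with startup and runtime code that ships precompiled with the toolchain and the C library, and the code that ships with the cross compiler is not the same as the code on the Pi. In the diff at the end of this post, for example, the container’s .text section starts at 0x3d0 while the Pi’s starts at 0x102e0, and much of the container’s side disassembles into <UNDEFINED> and <UNPREDICTABLE> instructions, which suggests that its startup code was built with different defaults (possibly as Thumb code that objdump is decoding here as ARM). While I was building the Docker container, I chose a Linux distribution and packages that were “close enough” to the ones found on the Raspberry Pi. Clearly, close enough is not enough to get the exact same executables, even though they end up producing the same result.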
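One rough way to probe how the two .text sections differ is to look at the raw instruction words from the diff at the end of this post. Most ARM-state instructions are unconditional, which shows up as a top nibble of 0xE, while a 32-bit Thumb-2 instruction starts with a halfword whose top five bits are 11101, 11110, or 11111. The Python sketch below applies both checks to the first few words of each listing. Treat it as a heuristic only: the function names are my own, and literal data or conditional instructions would fool it.

```python
def arm_always_fraction(words) -> float:
    """Fraction of 32-bit words whose condition field is 0xE (always)."""
    return sum(1 for word in words if word >> 28 == 0xE) / len(words)


def starts_32bit_thumb(word: int) -> bool:
    """True if the first halfword in memory is a 32-bit Thumb-2 prefix."""
    # objdump prints little-endian words, so the low half comes first.
    first_halfword = word & 0xFFFF
    return first_halfword >> 11 in (0b11101, 0b11110, 0b11111)


# First four words of each .text section, copied from the diff.
container_start = [0x0B00F04F, 0x0E00F04F, 0x466ABC02, 0xB401B404]
pi_start = [0xE3A0B000, 0xE3A0E000, 0xE49D1004, 0xE1A0200D]

print(arm_always_fraction(container_start))    # 0.0
print(arm_always_fraction(pi_start))           # 1.0
print(starts_32bit_thumb(container_start[0]))  # True
```

Under these assumptions, the start of the Pi’s .text section looks like ordinary ARM code, while the start of the container’s looks more like Thumb-2 code, which would also explain why so much of it fails to disassemble as ARM.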

Closing thoughts

My original goal was to cross compile code for the Raspberry Pi and get the same executable as the Pi would build. To do this, I installed a cross compiler, assembler, linker and binary utilities for ARM, plus GLIBC. While the executable does work, it is different from the reference one, so in a future post I will explore how other people have put together a system to build Raspberry Pi software. I suspect that I have ignored many details that would ensure that the output of our cross compiler perfectly matches the Pi’s.

Extras

diff of the .text sections

< 000003d0 <.text>:
<  3d0: 0b00f04f    bleq    3c514 <abort@plt+0x3c150>
<  3d4: 0e00f04f    cdpeq   0, 0, cr15, cr0, cr15, {2}
<  3d8: 466abc02    strbtmi fp, [sl], -r2, lsl #24
<  3dc: b401b404    strlt   fp, [r1], #-1028    ; 0xfffffbfc
<  3e0: a024f8df    ldrdge  pc, [r4], -pc   ; <UNPREDICTABLE>
<  3e4: 449aa308    ldrmi   sl, [sl], #776  ; 0x308
<  3e8: c020f8df    ldrdgt  pc, [r0], -pc   ; <UNPREDICTABLE>
<  3ec: c00cf85a    andgt   pc, ip, sl, asr r8  ; <UNPREDICTABLE>
<  3f0: cd04f84d    stcgt   8, cr15, [r4, #-308]    ; 0xfffffecc
<  3f4: f85a4b06            ; <UNDEFINED> instruction: 0xf85a4b06
<  3f8: 48063003    stmdami r6, {r0, r1, ip, sp}
<  3fc: 0000f85a    andeq   pc, r0, sl, asr r8  ; <UNPREDICTABLE>
<  400: efd4f7ff    svc 0x00d4f7ff
<  404: efdef7ff    svc 0x00def7ff
<  408: 00010bf8    strdeq  r0, [r1], -r8
<  40c: 0000001c    andeq   r0, r0, ip, lsl r0
<  410: 0000002c    andeq   r0, r0, ip, lsr #32
<  414: 00000030    andeq   r0, r0, r0, lsr r0
<  418: e59f3014    ldr r3, [pc, #20]   ; 434 <abort@plt+0x70>
<  41c: e59f2014    ldr r2, [pc, #20]   ; 438 <abort@plt+0x74>
<  420: e08f3003    add r3, pc, r3
<  424: e7932002    ldr r2, [r3, r2]
<  428: e3520000    cmp r2, #0
<  42c: 012fff1e    bxeq    lr
<  430: eaffffe0    b   3b8 <__gmon_start__@plt>
<  434: 00010bd8    ldrdeq  r0, [r1], -r8
<  438: 00000028    andeq   r0, r0, r8, lsr #32
<  43c: 4b074806    blmi    1d245c <abort@plt+0x1d2098>
<  440: 4a074478    bmi 1d1628 <abort@plt+0x1d1264>
<  444: 4283447b    addmi   r4, r3, #2063597568 ; 0x7b000000
<  448: d003447a    andle   r4, r3, sl, ror r4
<  44c: 58d34b05    ldmpl   r3, {r0, r2, r8, r9, fp, lr}^
<  450: 4718b103    ldrmi   fp, [r8, -r3, lsl #2]
<  454: bf004770    svclt   0x00004770
<  458: 00010bfc    strdeq  r0, [r1], -ip
<  45c: 00010bf8    strdeq  r0, [r1], -r8
<  460: 00010bb4            ; <UNDEFINED> instruction: 0x00010bb4
<  464: 00000024    andeq   r0, r0, r4, lsr #32
<  468: 4b094808    blmi    252490 <abort@plt+0x2520cc>
<  46c: 4a094478    bmi 251654 <abort@plt+0x251290>
<  470: 1a19447b    bne 651664 <abort@plt+0x6512a0>
<  474: 1089447a    addne   r4, r9, sl, ror r4
<  478: 71d1eb01    bicsvc  lr, r1, r1, lsl #22
<  47c: d0031049    andle   r1, r3, r9, asr #32
<  480: 58d34b05    ldmpl   r3, {r0, r2, r8, r9, fp, lr}^
<  484: 4718b103    ldrmi   fp, [r8, -r3, lsl #2]
<  488: bf004770    svclt   0x00004770
<  48c: 00010bd0    ldrdeq  r0, [r1], -r0   ; <UNPREDICTABLE>
<  490: 00010bcc    andeq   r0, r1, ip, asr #23
<  494: 00010b88    andeq   r0, r1, r8, lsl #23
<  498: 00000034    andeq   r0, r0, r4, lsr r0
<  49c: 4b0ab508    blmi    2ad8c4 <abort@plt+0x2ad500>
<  4a0: 447b4a0a    ldrbtmi r4, [fp], #-2570    ; 0xfffff5f6
<  4a4: 781b447a    ldmdavc fp, {r1, r3, r4, r5, r6, sl, lr}
<  4a8: 4b09b96b    blmi    26ea5c <abort@plt+0x26e698>
<  4ac: b12358d3    ldrdlt  r5, [r3, -r3]!
<  4b0: 447b4b08    ldrbtmi r4, [fp], #-2824    ; 0xfffff4f8
<  4b4: f7ff6818            ; <UNDEFINED> instruction: 0xf7ff6818
<  4b8: f7ffef74            ; <UNDEFINED> instruction: 0xf7ffef74
<  4bc: 4b06ffbf    blmi    1c03c0 <abort@plt+0x1bfffc>
<  4c0: 447b2201    ldrbtmi r2, [fp], #-513 ; 0xfffffdff
<  4c4: bd08701a    stclt   0, cr7, [r8, #-104] ; 0xffffff98
<  4c8: 00010b9a    muleq   r1, sl, fp
<  4cc: 00010b58    andeq   r0, r1, r8, asr fp
<  4d0: 00000020    andeq   r0, r0, r0, lsr #32
<  4d4: 00010b86    andeq   r0, r1, r6, lsl #23
<  4d8: 00010b7a    andeq   r0, r1, sl, ror fp
<  4dc: bf00e7c4    svclt   0x0000e7c4
<  4e0: e52db004    push    {fp}        ; (str fp, [sp, #-4]!)
<  4e4: e28db000    add fp, sp, #0
<  4e8: e3a03000    mov r3, #0
<  4ec: e1a00003    mov r0, r3
<  4f0: e28bd000    add sp, fp, #0
<  4f4: e49db004    pop {fp}        ; (ldr fp, [sp], #4)
<  4f8: e12fff1e    bx  lr
<  4fc: 43f8e92d    mvnsmi  lr, #737280 ; 0xb4000
<  500: 4e0c4607    cfmadd32mi  mvax0, mvfx4, mvfx12, mvfx7
<  504: 4d0c4688    stcmi   6, cr4, [ip, #-544] ; 0xfffffde0
<  508: 447e4691    ldrbtmi r4, [lr], #-1681    ; 0xfffff96f
<  50c: ef38f7ff    svc 0x0038f7ff
<  510: 1b76447d    blne    1d9170c <abort@plt+0x1d91348>
<  514: d00a10b6    strhle  r1, [sl], -r6
<  518: 24003d04    strcs   r3, [r0], #-3332    ; 0xfffff2fc
<  51c: f8553401            ; <UNDEFINED> instruction: 0xf8553401
<  520: 464a3f04    strbmi  r3, [sl], -r4, lsl #30
<  524: 46384641    ldrtmi  r4, [r8], -r1, asr #12
<  528: 42a64798    adcmi   r4, r6, #152, 14    ; 0x2600000
<  52c: e8bdd1f6    pop {r1, r2, r4, r5, r6, r7, r8, ip, lr, pc}
<  530: bf0083f8    svclt   0x000083f8
<  534: 000109fe    strdeq  r0, [r1], -lr
<  538: 000109f4    strdeq  r0, [r1], -r4
<  53c: bf004770    svclt   0x00004770
---
> 000102e0 <.text>:
>    102e0: e3a0b000    mov fp, #0
>    102e4: e3a0e000    mov lr, #0
>    102e8: e49d1004    pop {r1}        ; (ldr r1, [sp], #4)
>    102ec: e1a0200d    mov r2, sp
>    102f0: e52d2004    push    {r2}        ; (str r2, [sp, #-4]!)
>    102f4: e52d0004    push    {r0}        ; (str r0, [sp, #-4]!)
>    102f8: e59fc010    ldr ip, [pc, #16]   ; 10310 <abort@plt+0x3c>
>    102fc: e52dc004    push    {ip}        ; (str ip, [sp, #-4]!)
>    10300: e59f000c    ldr r0, [pc, #12]   ; 10314 <abort@plt+0x40>
>    10304: e59f300c    ldr r3, [pc, #12]   ; 10318 <abort@plt+0x44>
>    10308: ebffffeb    bl  102bc <__libc_start_main@plt>
>    1030c: ebfffff0    bl  102d4 <abort@plt>
>    10310: 0001044c    andeq   r0, r1, ip, asr #8
>    10314: 000103d0    ldrdeq  r0, [r1], -r0   ; <UNPREDICTABLE>
>    10318: 000103ec    andeq   r0, r1, ip, ror #7
>    1031c: e59f3014    ldr r3, [pc, #20]   ; 10338 <abort@plt+0x64>
>    10320: e59f2014    ldr r2, [pc, #20]   ; 1033c <abort@plt+0x68>
>    10324: e08f3003    add r3, pc, r3
>    10328: e7932002    ldr r2, [r3, r2]
>    1032c: e3520000    cmp r2, #0
>    10330: 012fff1e    bxeq    lr
>    10334: eaffffe3    b   102c8 <__gmon_start__@plt>
>    10338: 00010cd4    ldrdeq  r0, [r1], -r4
>    1033c: 00000018    andeq   r0, r0, r8, lsl r0
>    10340: e59f0018    ldr r0, [pc, #24]   ; 10360 <abort@plt+0x8c>
>    10344: e59f3018    ldr r3, [pc, #24]   ; 10364 <abort@plt+0x90>
>    10348: e1530000    cmp r3, r0
>    1034c: 012fff1e    bxeq    lr
>    10350: e59f3010    ldr r3, [pc, #16]   ; 10368 <abort@plt+0x94>
>    10354: e3530000    cmp r3, #0
>    10358: 012fff1e    bxeq    lr
>    1035c: e12fff13    bx  r3
>    10360: 00021024    andeq   r1, r2, r4, lsr #32
>    10364: 00021024    andeq   r1, r2, r4, lsr #32
>    10368: 00000000    andeq   r0, r0, r0
>    1036c: e59f0024    ldr r0, [pc, #36]   ; 10398 <abort@plt+0xc4>
>    10370: e59f1024    ldr r1, [pc, #36]   ; 1039c <abort@plt+0xc8>
>    10374: e0411000    sub r1, r1, r0
>    10378: e1a01141    asr r1, r1, #2
>    1037c: e0811fa1    add r1, r1, r1, lsr #31
>    10380: e1b010c1    asrs    r1, r1, #1
>    10384: 012fff1e    bxeq    lr
>    10388: e59f3010    ldr r3, [pc, #16]   ; 103a0 <abort@plt+0xcc>
>    1038c: e3530000    cmp r3, #0
>    10390: 012fff1e    bxeq    lr
>    10394: e12fff13    bx  r3
>    10398: 00021024    andeq   r1, r2, r4, lsr #32
>    1039c: 00021024    andeq   r1, r2, r4, lsr #32
>    103a0: 00000000    andeq   r0, r0, r0
>    103a4: e92d4010    push    {r4, lr}
>    103a8: e59f4018    ldr r4, [pc, #24]   ; 103c8 <abort@plt+0xf4>
>    103ac: e5d43000    ldrb    r3, [r4]
>    103b0: e3530000    cmp r3, #0
>    103b4: 18bd8010    popne   {r4, pc}
>    103b8: ebffffe0    bl  10340 <abort@plt+0x6c>
>    103bc: e3a03001    mov r3, #1
>    103c0: e5c43000    strb    r3, [r4]
>    103c4: e8bd8010    pop {r4, pc}
>    103c8: 00021024    andeq   r1, r2, r4, lsr #32
>    103cc: eaffffe6    b   1036c <abort@plt+0x98>
>    103d0: e52db004    push    {fp}        ; (str fp, [sp, #-4]!)
>    103d4: e28db000    add fp, sp, #0
>    103d8: e3a03000    mov r3, #0
>    103dc: e1a00003    mov r0, r3
>    103e0: e28bd000    add sp, fp, #0
>    103e4: e49db004    pop {fp}        ; (ldr fp, [sp], #4)
>    103e8: e12fff1e    bx  lr
>    103ec: e92d47f0    push    {r4, r5, r6, r7, r8, r9, sl, lr}
>    103f0: e1a07000    mov r7, r0
>    103f4: e59f6048    ldr r6, [pc, #72]   ; 10444 <abort@plt+0x170>
>    103f8: e59f5048    ldr r5, [pc, #72]   ; 10448 <abort@plt+0x174>
>    103fc: e08f6006    add r6, pc, r6
>    10400: e08f5005    add r5, pc, r5
>    10404: e0466005    sub r6, r6, r5
>    10408: e1a08001    mov r8, r1
>    1040c: e1a09002    mov r9, r2
>    10410: ebffffa1    bl  1029c <__libc_start_main@plt-0x20>
>    10414: e1b06146    asrs    r6, r6, #2
>    10418: 08bd87f0    popeq   {r4, r5, r6, r7, r8, r9, sl, pc}
>    1041c: e3a04000    mov r4, #0
>    10420: e2844001    add r4, r4, #1
>    10424: e4953004    ldr r3, [r5], #4
>    10428: e1a02009    mov r2, r9
>    1042c: e1a01008    mov r1, r8
>    10430: e1a00007    mov r0, r7
>    10434: e12fff33    blx r3
>    10438: e1560004    cmp r6, r4
>    1043c: 1afffff7    bne 10420 <abort@plt+0x14c>
>    10440: e8bd87f0    pop {r4, r5, r6, r7, r8, r9, sl, pc}
>    10444: 00010b10    andeq   r0, r1, r0, lsl fp
>    10448: 00010b08    andeq   r0, r1, r8, lsl #22
>    1044c: e12fff1e    bx  lr