Cross Compiling Raspberry Pi Code
The Raspberry Pi is an amazing computer. With this little device you can build and run C code, play with Python, connect to external hardware, hack Linux, create a home server, power robots, and do many more things. When I first started with Raspberry Pi, I didn’t know that I could write my source code from the comfort of my laptop, compile it, and then transfer the executable to the Pi. On the one hand, it’s great that you can run GCC directly on a Raspberry Pi: if you have an idea, you can quickly prototype it without having to worry about setting up a build environment. However, if you’re working on anything that will take longer than an afternoon to complete, you’ll probably want to use a fast PC that runs your favorite text editor, lets you keep multiple web browser windows open, and, most importantly, compiles your code quickly. To do that, you can set up a cross-compiling container that outputs an executable for your Raspberry Pi.
If you just want to jump straight to the do and not worry too much about the how, you can check out my GitHub repository for this project.
Stating our goal
Let’s say we have just installed the latest Raspberry Pi OS on a brand new Pi. At the time of writing, that would be the August 2020 version running on a Raspberry Pi 4 model B. As soon as you boot your Pi for the first time, you’ll have GCC 8.3.0 installed on it. By default, this version of GCC compiles code for ARMv6. The goal here is to install the appropriate compiler in a container and generate the same executable as you would get by compiling your code on the Raspberry Pi itself.
Building the container
I’ll use Docker to containerize our build environment. The Raspberry Pi Foundation recommends Ubuntu to cross compile a kernel for Raspberry Pi, so I figured that would be my starting point. However, after a lot of trial and error, I realized it would be easier if I used Debian; Raspberry Pi OS is much closer to Debian than it is to Ubuntu, and it seems to be easier to get the right version of the software I need to install on Debian, so that’s what I’ll show here.
At the bare minimum, we will need to install gcc-arm-linux-gnueabihf and binutils-arm-linux-gnueabihf. gcc-arm-linux-gnueabihf is the GNU C cross-compiler for the armhf architecture. We can double check that the Raspberry Pi 4’s architecture is armhf by typing dpkg --print-architecture on it. I’ll also install the same version of GCC as the one that comes with Raspberry Pi OS, gcc-8-arm-linux-gnueabihf. binutils-arm-linux-gnueabihf provides the assembler, linker, and binary utilities needed to build arm-linux-gnueabihf programs.
Our Docker container will look something like this:
FROM debian:10-slim
RUN apt-get update \
    && apt-get -y install \
        make \
        libc6=2.28-10 \
        gcc-8-arm-linux-gnueabihf \
        binutils-arm-linux-gnueabihf
I specifically chose Debian 10 to match the version of Raspberry Pi OS that I’m running, which is based on Debian 10 (also known as Debian buster). The Pi is running GCC 8.3.0-6+rpi1 and Binutils 2.31.1 and comes with GLIBC 2.28-10+rpi1. The Debian container has GCC Debian 8.3.0-2 and Binutils 2.31.1-16 installed, so all in all, it seems pretty close to what the Pi has.
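As a minimal sketch of how the image could be built and used, assuming the Dockerfile sits in the same directory as main.c and the Makefile: the image tag rpi-cross, the /work mount point, and the DOCKER override variable below are placeholders I made up for illustration, not names taken from the repository.

```shell
#!/bin/sh
# Hypothetical names: the image tag and the /work mount point are placeholders.
IMAGE="${IMAGE:-rpi-cross}"
DOCKER="${DOCKER:-docker}"   # set DOCKER=echo for a dry run

# Build the image from the Dockerfile in the current directory.
build_image() {
    "$DOCKER" build -t "$IMAGE" .
}

# Run a command inside the container with the current directory mounted,
# so that build artifacts end up on the host.
cross_run() {
    "$DOCKER" run --rm -v "$PWD":/work -w /work "$IMAGE" "$@"
}

# Example (commented out so that sourcing this file has no side effects):
# build_image
# cross_run make assembly
```

Mounting the source directory instead of copying it into the image means the image only has to be rebuilt when the toolchain changes, not on every code edit.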
Testing the cross compiler
To test that it works, I’ll compile an empty C program, which only contains the following line:
int main() {}
To make things easier, I’ll write a short Makefile to build all the artifacts:
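.PHONY: assembly object executable clean

# Compile main.c to ARMv6 assembly with optimizations disabled.
assembly:
	@mkdir -p build
	arm-linux-gnueabihf-gcc-8 \
		-S -march=armv6 -marm -mfpu=vfp -masm-syntax-unified \
		main.c \
		-o build/main.s \
		-O0

# Assemble build/main.s into an object file (no linking).
object: assembly
	@mkdir -p build
	arm-linux-gnueabihf-gcc-8 \
		-march=armv6 -marm -mfpu=vfp \
		-c build/main.s \
		-o build/main.o \
		-O0

# Assemble and link build/main.s into the final executable.
executable: assembly
	@mkdir -p build
	arm-linux-gnueabihf-gcc-8 \
		-march=armv6 -marm -mfpu=vfp \
		build/main.s \
		-o build/main \
		-O0

clean:
	rm -f build/*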
By default, the compiler that we have installed in our Debian container will compile software for ARMv7, but we want to build it for ARMv6 (just to match the default architecture in the GCC compiler that comes with the Raspberry Pi). If you only choose to build for ARMv6, the compiler will still try to produce Thumb instructions, but the Thumb variant it targets is not compatible with ARMv6. Therefore, we must choose to generate code that executes in ARM state with -marm.
After examining the assembly code generated by the Raspberry Pi and the Debian compiler, I realized that they also target different versions of the Vector Floating Point (VFP) architecture, so I added the -mfpu=vfp option to the Makefile so that the Debian compiler would build code for the same VFP version. If you want to see the defaults that GCC was configured with, you can type arm-linux-gnueabihf-gcc-8 -v to check them out.
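The -v output is long, so a small filter can help to pick out the defaults that matter here. This is only a sketch: the helper name gcc_defaults is made up, and it assumes the toolchain records its defaults as --with-arch, --with-fpu, --with-float, and --with-mode configure options, which depends on how the compiler was built.

```shell
#!/bin/sh
# Print the target defaults baked into a GCC build, one per line.
# Reads the text of `gcc -v` on standard input.
gcc_defaults() {
    tr ' ' '\n' | grep -E '^--with-(arch|fpu|float|mode)='
}

# Example (gcc -v writes to stderr, hence the redirect):
# arm-linux-gnueabihf-gcc-8 -v 2>&1 | gcc_defaults
```

Running the same filter against plain gcc -v on the Pi gives the other half of the comparison.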
Additionally, I chose to use Unified Assembly Language (UAL) for our assembly code with the -masm-syntax-unified flag. For this example, however, it shouldn’t make a difference.
Finally, since I intend to compare the assembly code generated by the Raspberry Pi’s compiler and the cross compiler, I added an option to disable all compiler optimizations with -O0. That should let me do the analysis without worrying about the compiler obfuscating the original intent of the source code.
Comparing the assembly code
The cross compiler generates this assembly code:
.arch armv6
.eabi_attribute 28, 1
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 6
.eabi_attribute 34, 1
.eabi_attribute 18, 4
.file "main.c"
.text
.align 2
.global main
.arch armv6
.syntax unified
.arm
.fpu vfp
.type main, %function
main:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
str fp, [sp, #-4]!
add fp, sp, #0
mov r3, #0
mov r0, r3
add sp, fp, #0
@ sp needed
ldr fp, [sp], #4
bx lr
.size main, .-main
.ident "GCC: (Debian 8.3.0-2) 8.3.0"
.section .note.GNU-stack,"",%progbits
The assembly generated by the Raspberry Pi only differs on the second-to-last line:
< .ident "GCC: (Debian 8.3.0-2) 8.3.0"
---
> .ident "GCC: (Raspbian 8.3.0-6+rpi1) 8.3.0"
The .ident directive only places a tag in the object file, so in terms of functionality, the assembly generated by the cross compiler is the same as the one compiled by the Raspberry Pi.
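Since the .ident line is expected to differ, the comparison can be automated by filtering it out before diffing. A minimal sketch, where the function name and the file names in the example are hypothetical:

```shell
#!/bin/sh
# Compare two assembly files while ignoring their .ident lines.
# Returns 0 when the remaining lines are identical, like diff does.
diff_ignoring_ident() {
    a="$(mktemp)" || return 2
    b="$(mktemp)" || { rm -f "$a"; return 2; }
    # Drop every line whose first non-blank token is .ident
    sed '/^[[:space:]]*\.ident/d' "$1" > "$a"
    sed '/^[[:space:]]*\.ident/d' "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Example, with hypothetical names for the two compiler outputs:
# diff_ignoring_ident main-debian.s main-pi.s && echo "same apart from .ident"
```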
Comparing the object files
Before doing any linking, it’s good to take a look at the main.o object file. Since the source assembly is (nearly) the same, so are the main.o files generated by the Pi and the cross compiler. A way to compare the object files is with objdump --disassemble --full-contents.
In our example, the only difference is in the .comment section, and the comment comes straight from the information in the .ident directive:
< 0000 00474343 3a202844 65626961 6e20382e .GCC: (Debian 8.
< 0010 332e302d 32292038 2e332e30 00 3.0-2) 8.3.0.
---
> 0000 00474343 3a202852 61737062 69616e20 .GCC: (Raspbian
> 0010 382e332e 302d362b 72706931 2920382e 8.3.0-6+rpi1) 8.
> 0020 332e3000 3.0.
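A comparison like this one could be scripted roughly as follows. It is a sketch under a few assumptions: the helper names are made up, the default tool is the cross toolchain’s arm-linux-gnueabihf-objdump (on the Pi itself it would be plain objdump), and the header line that names the input file is dropped so that differing paths don’t show up as a difference.

```shell
#!/bin/sh
# OBJDUMP can be overridden; the default is the cross toolchain's objdump.
OBJDUMP="${OBJDUMP:-arm-linux-gnueabihf-objdump}"

# Dump one object file, dropping the "file format" header line because
# it contains the file name and would always differ between two paths.
dump_object() {
    "$OBJDUMP" --disassemble --full-contents "$1" | sed '/file format/d'
}

# Write a .dump file next to each object file and diff the two dumps.
compare_objects() {
    dump_object "$1" > "$1.dump"
    dump_object "$2" > "$2.dump"
    diff "$1.dump" "$2.dump"
}

# Example, with hypothetical directories holding the two builds:
# compare_objects debian/main.o pi/main.o
```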
Analyzing the executable files
Although our cross-compiled main.o file is the same as the one compiled by the Pi, the final executable is not. The Debian container produces an 8108-byte executable, whereas the Pi makes a 7912-byte binary. Stripping the files results in 5564-byte and 5540-byte executables, respectively, so it’s clear that there must be some differences beyond the symbols in the object files.
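The size check can be repeated without modifying the original binary by stripping into a temporary file. A minimal sketch, assuming GNU strip from the cross toolchain (plain strip on the Pi); the function name size_report is made up:

```shell
#!/bin/sh
# STRIP can be overridden; the default is the cross toolchain's strip.
STRIP="${STRIP:-arm-linux-gnueabihf-strip}"

# Print the size in bytes of a binary before and after stripping,
# leaving the original file untouched.
size_report() {
    stripped="$(mktemp)" || return 2
    "$STRIP" -o "$stripped" "$1" || { rm -f "$stripped"; return 2; }
    printf '%s: %s bytes, %s bytes stripped\n' "$1" \
        "$(wc -c < "$1" | tr -d ' ')" \
        "$(wc -c < "$stripped" | tr -d ' ')"
    rm -f "$stripped"
}

# Example:
# size_report build/main
```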
After disassembling the stripped executable files, I can see that there are many differences between them. Just the .text section is full of differences (I’ve included them at the end of this blog post for any assembly nerds ;) ). However, one chunk that is the same in both files is the one that corresponds to the main function in the original assembly code:
push {fp} ; (str fp, [sp, #-4]!)
add fp, sp, #0
mov r3, #0
mov r0, r3
add sp, fp, #0
pop {fp} ; (ldr fp, [sp], #4)
bx lr
During the linking stage, our object file is stitched together with startup and runtime code supplied by each toolchain, and the code that the cross compiler’s toolchain supplied uses different instructions than the code on the Pi. While I was building the Docker container, I chose a Linux distribution and packages that were “close enough” to the ones found on the Raspberry Pi. Clearly, close enough is not enough to get the exact same executables, even though they end up producing the same result.
Closing thoughts
My original goal was to cross compile code for the Raspberry Pi and get the same executable as the Pi would build. To do this, I installed a cross compiler, assembler, linker and binary utilities for ARM, plus GLIBC. While the executable does work, it is different from the reference one, so in a future post I will explore how other people have put together a system to build Raspberry Pi software. I suspect that I have ignored many details that would ensure that the output of our cross compiler perfectly matches the Pi’s.
Extras
diff of the .text sections
< 000003d0 <.text>:
< 3d0: 0b00f04f bleq 3c514 <abort@plt+0x3c150>
< 3d4: 0e00f04f cdpeq 0, 0, cr15, cr0, cr15, {2}
< 3d8: 466abc02 strbtmi fp, [sl], -r2, lsl #24
< 3dc: b401b404 strlt fp, [r1], #-1028 ; 0xfffffbfc
< 3e0: a024f8df ldrdge pc, [r4], -pc ; <UNPREDICTABLE>
< 3e4: 449aa308 ldrmi sl, [sl], #776 ; 0x308
< 3e8: c020f8df ldrdgt pc, [r0], -pc ; <UNPREDICTABLE>
< 3ec: c00cf85a andgt pc, ip, sl, asr r8 ; <UNPREDICTABLE>
< 3f0: cd04f84d stcgt 8, cr15, [r4, #-308] ; 0xfffffecc
< 3f4: f85a4b06 ; <UNDEFINED> instruction: 0xf85a4b06
< 3f8: 48063003 stmdami r6, {r0, r1, ip, sp}
< 3fc: 0000f85a andeq pc, r0, sl, asr r8 ; <UNPREDICTABLE>
< 400: efd4f7ff svc 0x00d4f7ff
< 404: efdef7ff svc 0x00def7ff
< 408: 00010bf8 strdeq r0, [r1], -r8
< 40c: 0000001c andeq r0, r0, ip, lsl r0
< 410: 0000002c andeq r0, r0, ip, lsr #32
< 414: 00000030 andeq r0, r0, r0, lsr r0
< 418: e59f3014 ldr r3, [pc, #20] ; 434 <abort@plt+0x70>
< 41c: e59f2014 ldr r2, [pc, #20] ; 438 <abort@plt+0x74>
< 420: e08f3003 add r3, pc, r3
< 424: e7932002 ldr r2, [r3, r2]
< 428: e3520000 cmp r2, #0
< 42c: 012fff1e bxeq lr
< 430: eaffffe0 b 3b8 <__gmon_start__@plt>
< 434: 00010bd8 ldrdeq r0, [r1], -r8
< 438: 00000028 andeq r0, r0, r8, lsr #32
< 43c: 4b074806 blmi 1d245c <abort@plt+0x1d2098>
< 440: 4a074478 bmi 1d1628 <abort@plt+0x1d1264>
< 444: 4283447b addmi r4, r3, #2063597568 ; 0x7b000000
< 448: d003447a andle r4, r3, sl, ror r4
< 44c: 58d34b05 ldmpl r3, {r0, r2, r8, r9, fp, lr}^
< 450: 4718b103 ldrmi fp, [r8, -r3, lsl #2]
< 454: bf004770 svclt 0x00004770
< 458: 00010bfc strdeq r0, [r1], -ip
< 45c: 00010bf8 strdeq r0, [r1], -r8
< 460: 00010bb4 ; <UNDEFINED> instruction: 0x00010bb4
< 464: 00000024 andeq r0, r0, r4, lsr #32
< 468: 4b094808 blmi 252490 <abort@plt+0x2520cc>
< 46c: 4a094478 bmi 251654 <abort@plt+0x251290>
< 470: 1a19447b bne 651664 <abort@plt+0x6512a0>
< 474: 1089447a addne r4, r9, sl, ror r4
< 478: 71d1eb01 bicsvc lr, r1, r1, lsl #22
< 47c: d0031049 andle r1, r3, r9, asr #32
< 480: 58d34b05 ldmpl r3, {r0, r2, r8, r9, fp, lr}^
< 484: 4718b103 ldrmi fp, [r8, -r3, lsl #2]
< 488: bf004770 svclt 0x00004770
< 48c: 00010bd0 ldrdeq r0, [r1], -r0 ; <UNPREDICTABLE>
< 490: 00010bcc andeq r0, r1, ip, asr #23
< 494: 00010b88 andeq r0, r1, r8, lsl #23
< 498: 00000034 andeq r0, r0, r4, lsr r0
< 49c: 4b0ab508 blmi 2ad8c4 <abort@plt+0x2ad500>
< 4a0: 447b4a0a ldrbtmi r4, [fp], #-2570 ; 0xfffff5f6
< 4a4: 781b447a ldmdavc fp, {r1, r3, r4, r5, r6, sl, lr}
< 4a8: 4b09b96b blmi 26ea5c <abort@plt+0x26e698>
< 4ac: b12358d3 ldrdlt r5, [r3, -r3]!
< 4b0: 447b4b08 ldrbtmi r4, [fp], #-2824 ; 0xfffff4f8
< 4b4: f7ff6818 ; <UNDEFINED> instruction: 0xf7ff6818
< 4b8: f7ffef74 ; <UNDEFINED> instruction: 0xf7ffef74
< 4bc: 4b06ffbf blmi 1c03c0 <abort@plt+0x1bfffc>
< 4c0: 447b2201 ldrbtmi r2, [fp], #-513 ; 0xfffffdff
< 4c4: bd08701a stclt 0, cr7, [r8, #-104] ; 0xffffff98
< 4c8: 00010b9a muleq r1, sl, fp
< 4cc: 00010b58 andeq r0, r1, r8, asr fp
< 4d0: 00000020 andeq r0, r0, r0, lsr #32
< 4d4: 00010b86 andeq r0, r1, r6, lsl #23
< 4d8: 00010b7a andeq r0, r1, sl, ror fp
< 4dc: bf00e7c4 svclt 0x0000e7c4
< 4e0: e52db004 push {fp} ; (str fp, [sp, #-4]!)
< 4e4: e28db000 add fp, sp, #0
< 4e8: e3a03000 mov r3, #0
< 4ec: e1a00003 mov r0, r3
< 4f0: e28bd000 add sp, fp, #0
< 4f4: e49db004 pop {fp} ; (ldr fp, [sp], #4)
< 4f8: e12fff1e bx lr
< 4fc: 43f8e92d mvnsmi lr, #737280 ; 0xb4000
< 500: 4e0c4607 cfmadd32mi mvax0, mvfx4, mvfx12, mvfx7
< 504: 4d0c4688 stcmi 6, cr4, [ip, #-544] ; 0xfffffde0
< 508: 447e4691 ldrbtmi r4, [lr], #-1681 ; 0xfffff96f
< 50c: ef38f7ff svc 0x0038f7ff
< 510: 1b76447d blne 1d9170c <abort@plt+0x1d91348>
< 514: d00a10b6 strhle r1, [sl], -r6
< 518: 24003d04 strcs r3, [r0], #-3332 ; 0xfffff2fc
< 51c: f8553401 ; <UNDEFINED> instruction: 0xf8553401
< 520: 464a3f04 strbmi r3, [sl], -r4, lsl #30
< 524: 46384641 ldrtmi r4, [r8], -r1, asr #12
< 528: 42a64798 adcmi r4, r6, #152, 14 ; 0x2600000
< 52c: e8bdd1f6 pop {r1, r2, r4, r5, r6, r7, r8, ip, lr, pc}
< 530: bf0083f8 svclt 0x000083f8
< 534: 000109fe strdeq r0, [r1], -lr
< 538: 000109f4 strdeq r0, [r1], -r4
< 53c: bf004770 svclt 0x00004770
---
> 000102e0 <.text>:
> 102e0: e3a0b000 mov fp, #0
> 102e4: e3a0e000 mov lr, #0
> 102e8: e49d1004 pop {r1} ; (ldr r1, [sp], #4)
> 102ec: e1a0200d mov r2, sp
> 102f0: e52d2004 push {r2} ; (str r2, [sp, #-4]!)
> 102f4: e52d0004 push {r0} ; (str r0, [sp, #-4]!)
> 102f8: e59fc010 ldr ip, [pc, #16] ; 10310 <abort@plt+0x3c>
> 102fc: e52dc004 push {ip} ; (str ip, [sp, #-4]!)
> 10300: e59f000c ldr r0, [pc, #12] ; 10314 <abort@plt+0x40>
> 10304: e59f300c ldr r3, [pc, #12] ; 10318 <abort@plt+0x44>
> 10308: ebffffeb bl 102bc <__libc_start_main@plt>
> 1030c: ebfffff0 bl 102d4 <abort@plt>
> 10310: 0001044c andeq r0, r1, ip, asr #8
> 10314: 000103d0 ldrdeq r0, [r1], -r0 ; <UNPREDICTABLE>
> 10318: 000103ec andeq r0, r1, ip, ror #7
> 1031c: e59f3014 ldr r3, [pc, #20] ; 10338 <abort@plt+0x64>
> 10320: e59f2014 ldr r2, [pc, #20] ; 1033c <abort@plt+0x68>
> 10324: e08f3003 add r3, pc, r3
> 10328: e7932002 ldr r2, [r3, r2]
> 1032c: e3520000 cmp r2, #0
> 10330: 012fff1e bxeq lr
> 10334: eaffffe3 b 102c8 <__gmon_start__@plt>
> 10338: 00010cd4 ldrdeq r0, [r1], -r4
> 1033c: 00000018 andeq r0, r0, r8, lsl r0
> 10340: e59f0018 ldr r0, [pc, #24] ; 10360 <abort@plt+0x8c>
> 10344: e59f3018 ldr r3, [pc, #24] ; 10364 <abort@plt+0x90>
> 10348: e1530000 cmp r3, r0
> 1034c: 012fff1e bxeq lr
> 10350: e59f3010 ldr r3, [pc, #16] ; 10368 <abort@plt+0x94>
> 10354: e3530000 cmp r3, #0
> 10358: 012fff1e bxeq lr
> 1035c: e12fff13 bx r3
> 10360: 00021024 andeq r1, r2, r4, lsr #32
> 10364: 00021024 andeq r1, r2, r4, lsr #32
> 10368: 00000000 andeq r0, r0, r0
> 1036c: e59f0024 ldr r0, [pc, #36] ; 10398 <abort@plt+0xc4>
> 10370: e59f1024 ldr r1, [pc, #36] ; 1039c <abort@plt+0xc8>
> 10374: e0411000 sub r1, r1, r0
> 10378: e1a01141 asr r1, r1, #2
> 1037c: e0811fa1 add r1, r1, r1, lsr #31
> 10380: e1b010c1 asrs r1, r1, #1
> 10384: 012fff1e bxeq lr
> 10388: e59f3010 ldr r3, [pc, #16] ; 103a0 <abort@plt+0xcc>
> 1038c: e3530000 cmp r3, #0
> 10390: 012fff1e bxeq lr
> 10394: e12fff13 bx r3
> 10398: 00021024 andeq r1, r2, r4, lsr #32
> 1039c: 00021024 andeq r1, r2, r4, lsr #32
> 103a0: 00000000 andeq r0, r0, r0
> 103a4: e92d4010 push {r4, lr}
> 103a8: e59f4018 ldr r4, [pc, #24] ; 103c8 <abort@plt+0xf4>
> 103ac: e5d43000 ldrb r3, [r4]
> 103b0: e3530000 cmp r3, #0
> 103b4: 18bd8010 popne {r4, pc}
> 103b8: ebffffe0 bl 10340 <abort@plt+0x6c>
> 103bc: e3a03001 mov r3, #1
> 103c0: e5c43000 strb r3, [r4]
> 103c4: e8bd8010 pop {r4, pc}
> 103c8: 00021024 andeq r1, r2, r4, lsr #32
> 103cc: eaffffe6 b 1036c <abort@plt+0x98>
> 103d0: e52db004 push {fp} ; (str fp, [sp, #-4]!)
> 103d4: e28db000 add fp, sp, #0
> 103d8: e3a03000 mov r3, #0
> 103dc: e1a00003 mov r0, r3
> 103e0: e28bd000 add sp, fp, #0
> 103e4: e49db004 pop {fp} ; (ldr fp, [sp], #4)
> 103e8: e12fff1e bx lr
> 103ec: e92d47f0 push {r4, r5, r6, r7, r8, r9, sl, lr}
> 103f0: e1a07000 mov r7, r0
> 103f4: e59f6048 ldr r6, [pc, #72] ; 10444 <abort@plt+0x170>
> 103f8: e59f5048 ldr r5, [pc, #72] ; 10448 <abort@plt+0x174>
> 103fc: e08f6006 add r6, pc, r6
> 10400: e08f5005 add r5, pc, r5
> 10404: e0466005 sub r6, r6, r5
> 10408: e1a08001 mov r8, r1
> 1040c: e1a09002 mov r9, r2
> 10410: ebffffa1 bl 1029c <__libc_start_main@plt-0x20>
> 10414: e1b06146 asrs r6, r6, #2
> 10418: 08bd87f0 popeq {r4, r5, r6, r7, r8, r9, sl, pc}
> 1041c: e3a04000 mov r4, #0
> 10420: e2844001 add r4, r4, #1
> 10424: e4953004 ldr r3, [r5], #4
> 10428: e1a02009 mov r2, r9
> 1042c: e1a01008 mov r1, r8
> 10430: e1a00007 mov r0, r7
> 10434: e12fff33 blx r3
> 10438: e1560004 cmp r6, r4
> 1043c: 1afffff7 bne 10420 <abort@plt+0x14c>
> 10440: e8bd87f0 pop {r4, r5, r6, r7, r8, r9, sl, pc}
> 10444: 00010b10 andeq r0, r1, r0, lsl fp
> 10448: 00010b08 andeq r0, r1, r8, lsl #22
> 1044c: e12fff1e bx lr