
Tiniest hello world

10/15/2024

What’s the goal ?

Trying to output the smallest binary for a C Hello World.

Why ?

For fun and learning, any other questions ?

What does the hello world do ?

Write “Hello World!” to the standard output.

Environment, platform, architecture ?

Platform: Linux x86_64
Compiler: gcc
Executable format: ELF 64-bit

How do we measure the size of the output binary ?

Some would measure the size by the number of opcodes in the resulting binary.

For simplicity we will measure the resulting file size of the binary:
stat -c %s a.out

This command will provide the size of the file in bytes.
Note that there are other ways to get the file size but it is imho the simplest and most convenient.
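For instance, wc -c gives the same number and does not rely on the GNU-specific -c option of stat. A minimal sketch, the function names are made up for this example:

```shell
# Two ways to print a file size in bytes (function names are illustrative).

# GNU stat, as used in this post.
size_with_stat() {
    stat -c %s "$1"
}

# POSIX alternative: reading from stdin keeps the file name out of the output,
# and tr strips the padding some wc implementations add.
size_with_wc() {
    wc -c < "$1" | tr -d ' '
}
```

Usage: size_with_wc ./a.out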

All code used here is available in my github repo.

Let’s start with the base code (01):

#include <stdio.h>

int main () {
    printf("Hello World!");
}

Pretty simple right ?
But this simple 5-line program is compiled to a 15448-byte binary.

That’s waaaaaaaaay too much for us.
Let’s find ways to shrink that.

gcc optimizations

gcc has some optimization flags that you can use to increase program speed, reduce its size…

You can check what’s available for your version of gcc with gcc -Q --help=optimizers.
I tried many options like -Os and -Oz but I didn’t find any changes in output size.

Time to try something else.

Replacing printf by a lower level function (02)

What if we replaced the printf call with write, the thin libc wrapper around the write syscall ?
Would it change anything ?
Let’s see.

#include <unistd.h>

int main () {
    write(1, "Hello World!", 12);
}

Binary size: 15448 bytes
No changes, but we will keep the write version for the next steps.

Get rid of the main function and the C runtime (03)

Did you know that you could write a C program without a main function ?
Let’s see how this works.

#include <unistd.h>

void _start () {
    write(1, "Hello World!", 12);
    _exit(0);
}

Okay so what happened there ?
_start is actually the entrypoint of a program run on Linux.
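The _start function provided by the C runtime, among other things, prepares the program arguments, calls main and then exits the process with main’s return value.

Note that it is MY understanding of how things work and it is not an exhaustive list.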

What changed in the code ?

We replaced the main function with the _start function.
We had to explicitly call _exit to terminate the program (it segfaults if we don’t).
Note that we could use exit instead of _exit, the binary size is the same.

Time to compile

/usr/bin/ld: /tmp/cc2wHwD6.o: in function `_start':
03_hello.c:(.text+0x0): multiple definition of `_start'; /usr/lib/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../lib/Scrt1.o:(.text+0x0): first defined here
/usr/bin/ld: /usr/lib/gcc/x86_64-pc-linux-gnu/14.2.1/../../../../lib/Scrt1.o: in function `_start':
(.text+0x1b): undefined reference to `main'
collect2: error: ld returned 1 exit status

UH-OH what is going on ?

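The linker says that _start is defined twice (once in our file, once in Scrt1.o) and that the _start of Scrt1.o refers to a main function that does not exist.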

Indeed we forgot to tell gcc to NOT use the C _start function (which expects a main function to be found).
Let’s change this: gcc -nostartfiles <YOUR_C_FILE>

Recompiling…and YES we got our new binary !

Do not forget to test it before getting its size, we want a small binary but above all a WORKING binary :)

Binary size: 14296 bytes
YES ! Finally some improvement. I know, I know it’s not that much but we can do better.

Get rid of the libc (04)

This one will be tough.
Libc wraps Linux system calls and provides convenient functions to deal with strings, memory…
How do we talk to the OS without the libc ?

We need assembly.

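#include <unistd.h>
#include <stdint.h>

void _start () {
    /* write(1, "Hello World!", 12): syscall number 1 goes in rax */
    asm (
        "movq $1, %%rax;"
        "movq %0, %%rdi;"
        "movq %1, %%rsi;"
        "movq %2, %%rdx;"
        "syscall;"
        :
        : "r"((uint64_t)1), "r"((const char*)"Hello World!"), "r"((size_t)12)
        /* the syscall instruction itself overwrites rcx and r11 */
        : "%rax", "%rdi", "%rsi", "%rdx", "%rcx", "%r11", "memory"
    );

    /* exit(0): syscall number 60 goes in rax */
    asm (
        "movq $60, %%rax;"
        "movq $0, %%rdi;"
        "syscall;"
        :
        :
        : "%rax", "%rdi", "%rcx", "%r11"
    );
}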

We replaced our libc calls with the equivalent assembly code.
Since we don’t need libc anymore we can compile like this:
gcc -nostdlib <C_FILE>.

Notes:
The asm keyword is a GNU extension, it might not work with other compilers, use __asm__ if needed.
This syntax is called extended asm: it lets the assembly read C values, see %0, %1 and %2.
For more information about this, look at gcc asm doc.

Binary size: 13784 bytes
Ok so we reduced the size a bit.
But it should not be that big, where do those kilobytes come from ?

The next steps will be heavily inspired by this blog post that I recommend reading.

Actually most of those bytes come from linking and ELF (Executable and Linkable Format) sections.

For now we got rid of libc, but we still rely on the C language.
Let’s get rid of C, farewell my friend.

Get rid of C (05)

That means we cannot use our .c file anymore.
Luckily we already have a good amount of assembly written in the previous step.

global _start

section .data
	msg:	db "Hello world!"

section .text

_start:
    mov rax, 1
    mov rdi, 1
    mov rsi, msg
    mov rdx, 12
    syscall

    mov  rax, 60
    mov rdi, 0
    syscall

Some things to notice here: the operands are swapped (destination first) and the % and $ prefixes are gone.

This is to comply with nasm syntax.

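Finally, the build command: nasm -f elf64 05_hello.asm && gcc 05_hello.o -o a.out -no-pie -nostdlib
(-nostdlib is needed as in the previous step, otherwise the C runtime brings its own _start again.)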

First, we generate an object file with nasm and our assembly code.
Second, we still use gcc to link the object file (though we don’t need gcc anymore).

Binary size: 8928 bytes
Nice, almost 5k saved !

Get rid of gcc

Since we do not have any C code we don’t need gcc, let’s link our executable ourselves.
nasm -f elf64 05_hello.asm && ld -m elf_x86_64 05_hello.o -o a.out

Binary size: 8848 bytes
Not much of an improvement but it’s still better than before.

What’s next ?
You have seen many occurrences of ELF, elf64… let’s dive in :)

ELF optimization (06)

We don’t need symbols and relocation information since we don’t use any functions or dynamically linked libs.
To remove them use strip -s ./a.out.
From now on, I will strip symbols after every a.out generation.

Binary size: 8488 bytes

An ELF file has different sections.
By examining them (e.g. with readelf), .data seems to take a lot of space, let’s remove it.

global _start

section .text

_start:
    mov rax, 1
    mov rdi, 1
    mov rsi, msg
    mov rdx, 12
    syscall

    mov  rax, 60
    mov rdi, 0
    syscall

msg:	db "Hello world!"

Binary size: 4360 bytes
Bingo ! The size got divided by 2.

Can we shrink MORE ?
Let’s look at the bytes contained in our binary now.
You can use a tool like xxd to get a hexdump.

00000000: 7f45 4c46 0201 0100 0000 0000 0000 0000  .ELF............
00000010: 0200 3e00 0100 0000 0010 4000 0000 0000  ..>.......@.....
00000020: 4000 0000 0000 0000 4810 0000 0000 0000  @.......H.......
00000030: 0000 0000 4000 3800 0200 4000 0300 0200  ....@.8...@.....
00000040: 0100 0000 0400 0000 0000 0000 0000 0000  ................
00000050: 0000 4000 0000 0000 0000 4000 0000 0000  ..@.......@.....
00000060: b000 0000 0000 0000 b000 0000 0000 0000  ................
00000070: 0010 0000 0000 0000 0100 0000 0500 0000  ................
00000080: 0010 0000 0000 0000 0010 4000 0000 0000  ..........@.....
00000090: 0010 4000 0000 0000 3300 0000 0000 0000  ..@.....3.......
000000a0: 3300 0000 0000 0000 0010 0000 0000 0000  3...............
000000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000000f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
...
00000ff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001000: b801 0000 00bf 0100 0000 48be 2710 4000  ..........H.'.@.
00001010: 0000 0000 ba0c 0000 000f 05b8 3c00 0000  ............<...
00001020: bf00 0000 000f 0548 656c 6c6f 2077 6f72  .......Hello wor
00001030: 6c64 2100 2e73 6873 7472 7461 6200 2e74  ld!..shstrtab..t
00001040: 6578 7400 0000 0000 0000 0000 0000 0000  ext.............
00001050: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001060: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001070: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001080: 0000 0000 0000 0000 0b00 0000 0100 0000  ................
00001090: 0600 0000 0000 0000 0010 4000 0000 0000  ..........@.....
000010a0: 0010 0000 0000 0000 3300 0000 0000 0000  ........3.......
000010b0: 0000 0000 0000 0000 1000 0000 0000 0000  ................
000010c0: 0000 0000 0000 0000 0100 0000 0300 0000  ................
000010d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
000010e0: 3310 0000 0000 0000 1100 0000 0000 0000  3...............
000010f0: 0000 0000 0000 0000 0100 0000 0000 0000  ................
00001100: 0000 0000 0000 0000                      ........

Notice the bytes from 0xb0 to 0xfff: they are all zeros, we should find a way to remove them.
It’s time to manually build our binary, without the assembler or the linker.

What do we need ?

ELF header + Program Header Table + program code.
Note that in a usual ELF executable you would find symbols, relocation information and many sections for different parts of your program (code, readonly data, uninitialized data, initialized data…).

ELF header

64 bytes
Needed for the OS to understand what kind of ELF file it is.
readelf -h ./a.out
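This command shows that the ELF header is 64 bytes long and that the program headers start right after it, at byte 64, with 56 bytes per entry.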

So, to get our first part of the file, we need to get the first 120 bytes of our binary (64 bytes ELF header + 56 bytes program header table).
head -c 120 ./a.out
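Those two numbers can also be read straight from the file. A minimal sketch with od, assuming the field positions that follow from the Elf64_Ehdr layout (e_ehsize at byte 52, e_phentsize at byte 54); the function names are made up for this example:

```shell
# read_u16 FILE OFFSET
# Print the 2-byte little-endian value stored at OFFSET (helper name is illustrative).
read_u16() {
    file=$1
    offset=$2
    # od prints the two bytes as decimals; combine them low byte first.
    set -- $(od -An -tu1 -j "$offset" -N 2 "$file")
    echo $(( $1 + $2 * 256 ))
}

# headers_size FILE
# ELF header size plus the size of one program header entry.
headers_size() {
    ehsize=$(read_u16 "$1" 52)      # e_ehsize
    phentsize=$(read_u16 "$1" 54)   # e_phentsize
    echo $(( ehsize + phentsize ))
}
```

Usage: headers_size ./a.out, which gives 120 when the two fields hold 64 and 56 as in the readelf output.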

Now our actual code
Let’s see where our code is (.text section) and how long it is.
readelf -S ./a.out

There are 3 section headers, starting at offset 0x1048:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000401000  00001000
       0000000000000033  0000000000000000  AX       0     0     16
  [ 2] .shstrtab         STRTAB           0000000000000000  00001033
       0000000000000011  0000000000000000           0     0     1

If we look at the .text section, we can see its size 0x33 (51 bytes) and its offset in the file 0x1000 (4096).

Okay how to dump that part of the file ?

You could use many tools like tail, head, xxd or even dd.
Here is a simple example with tail and head:

tail -c +4097 ./a.out | head -c 51

The trick here is to use tail’s +X syntax, so tail reads from the byte X until the end of the file. Note the 4097 instead of 4096, because tail counts from 1, not 0.
Then we pipe that output to head to only get the first 51 bytes, our code :)

Now that we have the ELF header, the program header table and the code, let’s concatenate everything.

cat <(head -c 120 ./a.out) <(tail -c +4097 ./a.out | head -c 51) > ./a2.out

Some details about this command:
cat concatenates files passed as args to its stdout.
Here we don’t have file names but outputs of 2 commands.
With bash’s process substitution <(list), cat actually sees temporary file descriptors created by bash, each one corresponding to the output of the command list.
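dd can do the same slicing with offsets counted from 0, which avoids the +1 of tail, and a plain command group replaces the bash-only process substitution. A minimal sketch, slice and build_a2 are names made up for this example, and the offsets are the ones reported by readelf:

```shell
# slice FILE OFFSET COUNT
# Print COUNT bytes of FILE starting at byte OFFSET (counted from 0).
# OFFSET and COUNT may be decimal or 0x-prefixed hex.
slice() {
    dd if="$1" bs=1 skip="$(( $2 ))" count="$(( $3 ))" 2>/dev/null
}

# build_a2 INPUT OUTPUT
# Headers first (120 bytes), then the 0x33 bytes of .text found at offset 0x1000.
build_a2() {
    { slice "$1" 0 120; slice "$1" 0x1000 0x33; } > "$2"
}
```

Usage: build_a2 ./a.out ./a2.out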

Do not forget to make your new file executable: chmod +x ./a2.out

Now our small binary won’t work. Why ?

We moved the code to a different offset in the file, which no longer matches the information stored in the ELF header / program header table.
Check for yourself with readelf -a ./a2.out

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x401000
  Start of program headers:          64 (bytes into file)
  Start of section headers:          4168 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         2
  Size of section headers:           64 (bytes)
  Number of section headers:         3
  Section header string table index: 2
readelf: Error: Reading 192 bytes extends past end of file for section headers
readelf: Error: Section headers are not available!
readelf: Error: Reading 112 bytes extends past end of file for program headers

We need to manually modify some bytes of the binary with a hex editor.

Modify the ELF

To modify the ELF we need to understand which bytes of the ELF must be changed.
See https://man7.org/linux/man-pages/man5/elf.5.html for a description of the ELF header.

Entrypoint address

As we saw with readelf, the current entrypoint address is wrong: 0x401000.
It should be 0x400078 because our code is now at position 120 (0x78) in the file, and the file is loaded at 0x400000.

Entrypoint address is the 5th field of the ELF header, starting at the 24th byte (0x18) and 64bit long (8 bytes) (for my architecture x86_64)

In my editor, I can see 00 10 40 00 00 00 00 00. My architecture is little-endian, so 0x00 is the least significant byte, then goes 0x10, etc…
We should have 78 00 40 00 00 00 00 00
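To see the byte order without counting by hand, here is a minimal sketch that prints the 8 little-endian bytes of an address. le64 is a name made up for this example:

```shell
# le64 VALUE
# Print VALUE (decimal or 0x hex) as 8 hex bytes, least significant byte first.
le64() {
    value=$(( $1 ))
    out=""
    i=0
    while [ "$i" -lt 8 ]; do
        # Emit the lowest byte, then shift the next one into place.
        out="$out$(printf '%02x' $(( value & 0xff ))) "
        value=$(( value >> 8 ))
        i=$(( i + 1 ))
    done
    echo "${out% }"
}

le64 0x401000   # prints 00 10 40 00 00 00 00 00
le64 0x400078   # prints 78 00 40 00 00 00 00 00
```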

Start of section header

Originally that was 4168 (0x1048) but we don’t have a section header anymore so it should be 0.
Start of section header is the 7th field of the ELF header, starting at the 40th byte (0x28) and 8 bytes long.

Number of program headers

Originally 2, but we have only 1 program header.
Nb of program headers is the 11th field of the ELF header, starting at the 56th byte (0x38), 2 bytes long.

Number of section headers

Originally 3 but we now have 0 section header.
Nb of section headers is the 13th field of the ELF header, starting at the 60th byte (0x3C), 2 bytes long.

Section header string table index

Originally 2 but we now have 0 sections, this should be 0.
It’s the last field of the ELF header, starting at the 62nd byte (0x3E), 2 bytes long.
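A hex editor is not the only option: the same edits can be scripted with printf and dd. A minimal sketch, assuming the offsets listed above; poke and patch_elf_header are names made up for this example:

```shell
# poke FILE OFFSET BYTE...
# Overwrite bytes of FILE in place, starting at OFFSET (decimal or 0x hex).
# Bytes are given as two-digit hex values, in file order.
poke() {
    file=$1
    offset=$(( $2 ))
    shift 2
    for byte in "$@"; do
        # Turn each hex byte into an octal escape, which any POSIX printf accepts.
        printf "\\$(printf '%03o' "0x$byte")"
    done | dd of="$file" bs=1 seek="$offset" conv=notrunc 2>/dev/null
}

# patch_elf_header FILE
# Apply the five ELF header edits described above.
patch_elf_header() {
    poke "$1" 0x18 78 00 40 00 00 00 00 00   # entry point: 0x400078
    poke "$1" 0x28 00 00 00 00 00 00 00 00   # start of section headers: 0
    poke "$1" 0x38 01 00                     # number of program headers: 1
    poke "$1" 0x3c 00 00                     # number of section headers: 0
    poke "$1" 0x3e 00 00                     # section header string table index: 0
}
```

Usage: patch_elf_header ./a2.out. conv=notrunc is what keeps dd from truncating the file after the patched bytes.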

Now that we are done with the ELF header, some changes must be made to the program header table.

Program header table

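The offset from the beginning of the file at which the first byte of the segment resides changed.
Our code used to be at 0x1000, it is now at 0x78, so this field must be set to 0x78 (in the program header we kept, it was 0).
It’s the 3rd field of the program header, starting at the 72nd byte (0x48), 8 bytes long.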

The virtual address of the program has changed

Originally 0x400000, now 0x400078.
It’s the 4th field of the program header. Since the program header follows the ELF header, it’s at the 80th byte (0x50), 8 bytes long.

The segment flags must change too.
Originally 0x004, which means read-only; we need at least read and exec for our code, so 0x005.
It’s the 2nd field of the program header (for x86_64 arch), 68th byte (0x44), 4 bytes long.
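Before running the binary, the three program header fields can be read back to check the edits. A minimal sketch with od, assuming the offsets given above; hex_at and check_program_header are names made up for this example:

```shell
# hex_at FILE OFFSET LENGTH
# Print LENGTH bytes of FILE starting at OFFSET as space-separated hex.
hex_at() {
    od -An -tx1 -v -j "$(( $2 ))" -N "$3" "$1" | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# check_program_header FILE
# Show the fields edited above; the expected values are the ones from this post.
check_program_header() {
    echo "flags : $(hex_at "$1" 0x44 4)"   # expect 05 00 00 00
    echo "offset: $(hex_at "$1" 0x48 8)"   # expect 78 00 00 00 00 00 00 00
    echo "vaddr : $(hex_at "$1" 0x50 8)"   # expect 78 00 40 00 00 00 00 00
}
```

Usage: check_program_header ./a2.out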

Relocate the Hello World! string

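Since we moved our code and the string that follows it, the mov rsi, msg instruction still refers to the old address of the string.
Originally 0x0000000000401027 (8 bytes on a 64bit arch), it should be 0x000000000040009f: the string sits 0x27 bytes after the start of the code, which is now loaded at 0x400078.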
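The new address is just the old distance from the start of the code, applied to the new start. A minimal sketch of that arithmetic, relocate is a name made up for this example:

```shell
# relocate OLD_ADDR OLD_BASE NEW_BASE
# Keep the distance to the start of the code, change the start.
relocate() {
    printf '0x%x\n' $(( $3 + ($1 - $2) ))
}

relocate 0x401027 0x401000 0x400078   # prints 0x40009f
```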

We now have a functional binary (check it yourself ;)

Final binary size: 171 bytes

If you checked https://ech0.re/building-the-smallest-elf-program/, you can see that we have one less byte than him.
It’s because our “Hello world!” string does not contain the line feed (\n).

That’s all folks !

Final binary:

7f454c4602010100000000000000000002003e0001000000780040000000
000040000000000000000000000000000000000000004000380001000000
000000000100000005000000780000000000000078004000000000000000
400000000000b000000000000000b0000000000000000010000000000000
b801000000bf0100000048be9f00400000000000ba0c0000000f05b83c00
0000bf000000000f0548656c6c6f20776f726c6421
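To try it without retyping everything in a hex editor, the dump above can be turned back into a file. A minimal sketch, hex_to_bin and write_final_binary are names made up for this example (xxd -r -p does the same conversion if xxd is installed):

```shell
# hex_to_bin
# Read hex digits on stdin (whitespace ignored) and write the raw bytes to stdout.
hex_to_bin() {
    { tr -d ' \n\r\t'; echo; } | fold -w 2 | while read -r byte; do
        [ -n "$byte" ] || continue
        # Turn each hex byte into an octal escape, which any POSIX printf accepts.
        printf "\\$(printf '%03o' "0x$byte")"
    done
}

# write_final_binary OUTPUT
# Recreate the final binary from the dump above and make it executable.
write_final_binary() {
    hex_to_bin > "$1" <<'EOF'
7f454c4602010100000000000000000002003e0001000000780040000000
000040000000000000000000000000000000000000004000380001000000
000000000100000005000000780000000000000078004000000000000000
400000000000b000000000000000b0000000000000000010000000000000
b801000000bf0100000048be9f00400000000000ba0c0000000f05b83c00
0000bf000000000f0548656c6c6f20776f726c6421
EOF
    chmod +x "$1"
}
```

Usage: write_final_binary ./tiny.out && stat -c %s ./tiny.out (the file name is arbitrary).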