Any sufficiently advanced technology is indistinguishable from magic.
Arthur C. Clarke
For the first example I'll present the simplest piece of code that still gives sufficient feedback. Our aim is to implant it into /bin/sh. [1] On practically every recent installation of SunOS/sparc the following code will emit three magic letters instead of just dumping core.
Source: pre/sparc-sunos5.9/magic_elf/magic_elf.c
#include <unistd.h>
int main() { write(1, (void*)0x10001, 3); return 0; }
Command: pre/sparc-sunos5.9/magic_elf/cc.sh
#!/usr/xpg4/bin/sh
/opt/sfw/bin/gcc \
-Wall -O1 -I . -I out/sparc-sunos5.9 -D NDEBUG \
-o tmp/sparc-sunos5.9/magic_elf/magic_elf \
pre/sparc-sunos5.9/magic_elf/magic_elf.c \
&& tmp/sparc-sunos5.9/magic_elf/magic_elf
Output: out/sparc-sunos5.9/magic_elf/magic_elf
ELF
The three letters are part of the signature of ELF files. Executables created by ld(1) are always mapped into the same memory region. That's why the program can find its own header at a predictable virtual address.
RTFM. [2] Just read all of Executable and linkable format (i).
0x10000 is not a natural constant, but happens to be the default base address of ELF executables produced by ld(1) on sparc. The GNU ld options -Ttext ORG and --section-start SECTIONNAME=ORG should allow changing it, but I didn't get that working. Anyway, the layout of executables produced by ld(1) is straightforward.
One ELF header - Elf32_Ehdr
Program headers - Elf32_Phdr
Program interpreter (not if statically linked)
Code
Data
Section headers - Elf32_Shdr
Everything from the start of the file to the last byte of code is mapped into one segment (colloquially named "code" or "text") that begins at the base address. There is a whole chapter called Segments describing tools to view all these details. In the meantime I will show fancy ways to get by without them.
What would you do if you knew nothing about ELF and just asked yourself how that example works? How can you be sure that the executable file really contains those three letters?
A good start for finding text in binary files is strings(1).
Command: pre/sparc-sunos5.9/magic_elf/strings.sh
#!/usr/xpg4/bin/sh
# without "-a -n 3" we don't get any output
/usr/bin/strings -a -n 3 \
tmp/sparc-sunos5.9/magic_elf/magic_elf \
| /usr/xpg4/bin/grep -n ELF
Output: out/sparc-sunos5.9/magic_elf/strings
1:ELF
The leading 1: is written by grep(1) and tells us that our three-letter word is the first string found. This gives a hint about where to look for it in a hex dump. It is difficult to search for strings in such a dump because of line breaks. Interactive tools like hexedit(1) or khexedit(1) might be useful.
The traditional tool for dumps is called od(1), the abbreviation for "octal dump". The classic version does provide hexadecimal output, but unfortunately not for single bytes. Option -x outputs "words", which are defined to be two bytes. This does not matter on big-endian machines like the sparc, but on i386 and alpha it is quite confusing.
On both Linux and SunOS od features option -tx1 to get byte-wise hexadecimals. FreeBSD's od has no equivalent. Another interesting option is -N count which reads no more than count bytes of input. Again it is not available on FreeBSD.
Command: pre/sparc-sunos5.9/magic_elf/od/SunOS.sh
#!/usr/xpg4/bin/sh
src=tmp/sparc-sunos5.9/magic_elf/magic_elf
/usr/xpg4/bin/od -N 16 -c ${src} | /usr/xpg4/bin/sed 1q
/usr/xpg4/bin/od -N 16 -x ${src} | /usr/xpg4/bin/sed 1q
/usr/xpg4/bin/od -N 16 -tx1 ${src} | /usr/xpg4/bin/sed 1q
/usr/xpg4/bin/od -N 16 -ta ${src} | /usr/xpg4/bin/sed 1q
Output: out/sparc-sunos5.9/magic_elf/od
0000000 177 E L F 001 002 001 \0 \0 \0 \0 \0 \0 \0 \0 \0
0000000 7f45 4c46 0102 0100 0000 0000 0000 0000
0000000 7f 45 4c 46 01 02 01 00 00 00 00 00 00 00 00 00
0000000 del E L F soh stx soh nul nul nul nul nul nul nul nul nul
hexdump features user defined formats. It is a typical ingredient of Linux and FreeBSD installations, but not shipped with SunOS.
xxd is part of vim. [3] And though it can't be called part of a core installation, it comes pretty close on Linux distributions.
Source: pre/sparc-sunos5.9/magic_elf/xxd.sh
#!/usr/xpg4/bin/sh
/opt/sfw/bin/xxd -l 80 \
< tmp/sparc-sunos5.9/magic_elf/magic_elf
Output: out/sparc-sunos5.9/magic_elf/xxd
0000000: 7f45 4c46 0102 0100 0000 0000 0000 0000 .ELF............
0000010: 0002 0002 0000 0001 0001 042c 0000 0034 ...........,...4
0000020: 0000 13ec 0000 0000 0034 0020 0005 0028 .........4. ...(
0000030: 001b 0019 0000 0006 0000 0034 0001 0034 ...........4...4
0000040: 0000 0000 0000 00a0 0000 00a0 0000 0005 ................
Anyway, at this point we can guess that file offset 1 and 0x10000 + 1 are not coincidental. A test program might help.
Source: pre/sparc-sunos5.9/magic_elf/addr_of_main.c
#include <stdio.h>
int main()
{
printf("# sizeof(unsigned long)=%u\n", (unsigned)sizeof(unsigned long));
printf("# 10000=%#02x\n", *(unsigned char*)0x10000);
printf("# 10001=%.3s\n", (char*)0x10001);
/* output of %p can either be hex or decimal, so comment it out */
printf("# addr_main_p=%p\n\n", main);
/* suffix "_x" is required by post-processing (etc/calc.pl) */
printf("addr_main_x=%lx\n", (unsigned long)main);
printf("ofs_main_x=%lu\n", (unsigned long)main - 0x10000);
printf("ofs_main=%lu\n", (unsigned long)main - 0x10000);
return 0;
}
Output: out/sparc-sunos5.9/magic_elf/addr_of_main
# sizeof(unsigned long)=4
# 10000=0x7f
# 10001=ELF
# addr_main_p=105c0
addr_main_x=105c0
ofs_main_x=1472
ofs_main=1472
Note that the output of %p is not standardized. Some platforms print a leading 0x, some don't. Even %#p does not guarantee a leading 0x. Anyway, the output looks good. The byte at address 0x10000 + 0 is equal to that at file offset 0. And 0x105c0 is a plausible address of function main.
Source: pre/sparc-sunos5.9/magic_elf/other_perl.pl
#!/usr/perl5/5.6.1/bin/perl
syscall 4, 1, 0x10001, 3
Output: out/sparc-sunos5.9/magic_elf/other_perl
ELF
Source: pre/sparc-sunos5.9/magic_elf/other_exe.sh
#!/usr/xpg4/bin/sh
/usr/bin/dd if=/proc/self/object/a.out bs=1 skip=1 count=3 2>/dev/null
Output: out/sparc-sunos5.9/magic_elf/other_exe
ELF
I found nothing equivalent to Linux's /proc/self/mem.
[1] On this platform it's actually /usr/bin/csh. This is the result of a systematic search at A kingdom for a shell.
[2]
[3]