3. Magic revealed

 

It looks worse than you can imagine!

I can imagine some pretty bad things!

That's why I said *worse*!

 Terry Pratchett, Moving Pictures

The fancy output format in "The address of main" was chosen for a reason: it is valid input for /bin/sh, so other scripts can simply source it. Let's see whether the program from "The magic of the Elf" has main at the same offset.
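As a minimal sketch of that idea, a file of plain name=value lines can be pulled into a script with the "." command. The temporary file and the value of ofs below are assumptions made for illustration; only the variable names and the value of main_l are taken from the listings in this chapter.

```shell
#!/bin/sh
# Hypothetical stand-in for out/.../addr_of_main; the value of ofs is made up.
addr_file=$(mktemp)
cat > "${addr_file}" <<'EOF'
main_l=08048400
ofs=1024
EOF

# The "." command executes the file in the current shell,
# so its assignments become ordinary variables of this script.
. "${addr_file}"
rm -f "${addr_file}"

sourced="ofs=${ofs} main=0x${main_l}"
echo "${sourced}"
```

Storing the address without a 0x prefix leaves it to each consumer to add the prefix where the tool expects one.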

Command: pre/i386-redhat7.3-linux/magic_elf/ndisasm.sh
#!/bin/sh
. out/i386-redhat7.3-linux/magic_elf/addr_of_main
/usr/bin/ndisasm -e ${ofs} -o 0x${main_l} -U \
	tmp/i386-redhat7.3-linux/magic_elf/magic_elf \
| /bin/sed -e "/\<\(ret\|hlt\)\>/q"

Output: out/i386-redhat7.3-linux/magic_elf/ndisasm.asm
08048400  55                push ebp
08048401  89E5              mov ebp,esp
08048403  83EC0C            sub esp,byte +0xc
08048406  6A03              push byte +0x3
08048408  6801800408        push dword 0x8048001
0804840D  6A01              push byte +0x1
0804840F  E8BCFEFFFF        call 0x80482d0
08048414  B800000000        mov eax,0x0
08048419  C9                leave
0804841A  C3                ret
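
The sed filter at the end of ndisasm.sh is what cuts the listing off after the first ret or hlt. A small sketch of that stage in isolation, assuming GNU sed (the \<, \> and \| operators are GNU extensions); the trailing nop line is invented to show that it gets dropped:

```shell
#!/bin/sh
# Print input up to and including the first line that contains
# "ret" or "hlt" as a whole word, then quit.
until_return() {
	sed -e '/\<\(ret\|hlt\)\>/q'
}

listing=$(printf '%s\n' \
	'08048419  C9                leave' \
	'0804841A  C3                ret' \
	'0804841B  90                nop' \
	| until_return)
echo "${listing}"
```

The q command prints the current line before quitting, so the ret itself stays in the output.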

The simplicity of ndisasm is charming; objdump, by contrast, requires heavier machinery. Its output includes function labels, so filtering the complete disassembly can yield the desired code without prior knowledge of the function address. For symmetry with ndisasm we use --start-address anyway. That option accepts only numeric values, not symbol names.
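
For comparison, the label-based filtering mentioned above could be sketched roughly as follows. The function name extract_function and the sample listing are made up for illustration; the sketch assumes that objdump -d starts each function with a line of the form "address <name>:" and separates functions with a blank line.

```shell
#!/bin/sh
# Print the block that objdump -d labels "<name>:", up to the next blank line.
# Reads a complete disassembly on stdin; $1 is the function name.
extract_function() {
	awk -v label="<$1>:" '
		$2 == label { printing = 1 }    # header line: "08048400 <main>:"
		printing && NF == 0 { exit }    # blank line ends the function
		printing { print }
	'
}

# Hand-written sample in objdump layout (not real output), for demonstration.
sample='080483f0 <frame_dummy>:
 80483f0:   c3      ret

08048400 <main>:
 8048400:   55      push   %ebp
 804841a:   c3      ret

08048420 <other>:
 8048420:   c3      ret'

main_block=$(printf '%s\n' "${sample}" | extract_function main)
printf '%s\n' "${main_block}"
```

With a real binary the call would be something like "objdump -d file | extract_function main". This only works while the symbol table is intact, since objdump can print labels only for symbols it knows.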

Command: pre/i386-redhat7.3-linux/magic_elf/objdump.sh
#!/bin/sh
. out/i386-redhat7.3-linux/magic_elf/addr_of_main
/usr/bin/objdump -d --start-address=0x${main_l} \
	tmp/i386-redhat7.3-linux/magic_elf/magic_elf \
| pre/i386-redhat7.3-linux/magic_elf/objdump_format.sh ${main_l}

Command: pre/i386-redhat7.3-linux/magic_elf/objdump_format.sh
#!/bin/sh
xdigit='[[:xdigit:]]\{1,\}'
start_address=${1:-$xdigit}

# white space is tab-stop, not just spaces
tab='[^	]\{1,\}	'
space='[[:space:]]\{1,\}'
nospace='[^[:space:]]\{1,\}'

/bin/sed -n -e "/^ *${start_address}:/,$ p" \
| /bin/sed \
	-e 's/ *	/	/g' \
	-e "s/${space}\(;\)/	\1/" \
	-e "s/^\(${tab}${tab}${nospace}\)${space}/\1	/" \
	-e "/^[[:space:]]*\.*[[:space:]]*$/q" \
	-e "/\<\(ret\|hlt\)\>/q" \
| /usr/bin/expand -t 12,32,40,60

Output: out/i386-redhat7.3-linux/magic_elf/objdump.asm
 8048400:   55                  push    %ebp
 8048401:   89 e5               mov     %esp,%ebp
 8048403:   83 ec 0c            sub     $0xc,%esp
 8048406:   6a 03               push    $0x3
 8048408:   68 01 80 04 08      push    $0x8048001
 804840d:   6a 01               push    $0x1
 804840f:   e8 bc fe ff ff      call    80482d0 <_init+0x38>
 8048414:   b8 00 00 00 00      mov     $0x0,%eax
 8048419:   c9                  leave   
 804841a:   c3                  ret     

This looks like a real main, so both programs indeed have main at the same offset. Unfortunately, a brief look through /bin proves this to be pure chance. And instead of a real system call for write(2) we see something strange: a call to 0x80482d0, which objdump can only describe as _init+0x38. It resolves to a location in a shared library. But what function in what library?

3.1. GDB to the rescue

Output: out/i386-redhat7.3-linux/magic_elf/gdb
[func=main]
0x8048400 <main>:               push    ebp
0x8048401 <main+1>:             mov     ebp,esp
0x8048403 <main+3>:             sub     esp,0xc
0x8048406 <main+6>:             push    0x3
0x8048408 <main+8>:             push    0x8048001
0x804840d <main+13>:            push    0x1
0x804840f <main+15>:            call    0x80482d0 <write>
0x8048414 <main+20>:            mov     eax,0x0

Looks better. We need a way to retrieve the function name, write, from this output. Then we can pass that name to gdb as the function to disassemble.

Command: pre/i386-redhat7.3-linux/evil_magic/first_gdb_func.sed
#!/bin/sed -nf
/.*<\(.*\)>$/ {
	s//\1/
	p
	q
}
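
To see what this script does, it can be fed two lines of the gdb output shown earlier. In the sketch below the sed commands are inlined into a shell function (first_func is a name chosen here) so that the snippet runs on its own:

```shell
#!/bin/sh
# Same logic as first_gdb_func.sed, inlined so the snippet is self-contained.
first_func() {
	sed -n -e '/.*<\(.*\)>$/ {
		s//\1/
		p
		q
	}'
}

# Two lines copied from the gdb output; only the call ends in ">".
func=$(printf '%s\n' \
	'0x804840d <main+13>:            push    0x1' \
	'0x804840f <main+15>:            call    0x80482d0 <write>' \
	| first_func)
echo "${func}"
```

The first line is skipped because it does not end in ">". On the second line the greedy ".*<" runs up to the last "<", so the captured group is the call target and not "main+15". The empty pattern in "s//\1/" reuses the address regex.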

Command: pre/i386-redhat7.3-linux/evil_magic/gdb_write.sh
#!/bin/sh
file=${1:-tmp/i386-redhat7.3-linux/magic_elf/magic_elf}
func=$( pre/i386-redhat7.3-linux/evil_magic/first_gdb_func.sed \
	< out/i386-redhat7.3-linux/magic_elf/gdb )

/bin/echo "[func=${func}]"
pre/i386-redhat7.3-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat7.3-linux/magic_elf/gdb_format.sh

Output: out/i386-redhat7.3-linux/evil_magic/write.gdb
[func=write]
0x80482d0 <write>:              jmp     ds:0x8049584
0x80482d6 <write+6>:            push    0x8
0x80482db <write+11>:           jmp     0x80482b0 <_init+24>

Oops. Shared libraries don't share their secrets with everyone.

3.2. In doubt use force

We could now search for a fine manual explaining how to debug shared libraries. Or just compile the bugger statically.

Seems we found an easy way to fill up the hard disk. Anyway, what does gdb(1) have to say about it?

The function was called write before, and it is called write now. Let's look at what is behind the name.

3.3. Write your name

The disassembly above is not guaranteed to work. The names of symbols imported by libraries differ from one platform to the other, and from one compiler to the other. A more rational approach is to search the listing of all symbols for similar names and identical addresses.
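
One way to sketch such a search, assuming a symbol listing in the default three-column layout of nm(1), "address type name". The function name same_address, the sample symbol table, its type letters and the alias names in it are all made up for illustration:

```shell
#!/bin/sh
# List every symbol that shares its address with a symbol matching $1.
# Expects "address type name" lines on stdin; other lines are ignored.
same_address() {
	awk -v pattern="$1" '
		NF == 3 { names[$1] = names[$1] " " $3 }     # collect names per address
		NF == 3 && $3 ~ pattern { wanted[$1] = 1 }   # remember matching addresses
		END { for (addr in wanted) print addr ":" names[addr] }
	'
}

# Hypothetical symbol table, for demonstration only.
aliases=$(printf '%s\n' \
	'0804ccf0 T __write' \
	'0804ccf0 W write' \
	'0804ccf0 T __libc_write' \
	'0804cd30 T __lseek' \
	| same_address 'write$')
echo "${aliases}"
```

Against a real static binary this would be used as "nm file | same_address 'write$'". Undefined symbols have no address column and are skipped by the NF == 3 test.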

I suspect there is actually order behind the chaos. The symbol __write, with a varying number of leading underscores, seems to be "the real thing" on all platforms. The aliases for the value, 0x804ccf0, differ a lot.

Two man pages give some overview of system calls: intro(2) and syscalls(2). /usr/include/unistd.h declares a traditional general-purpose interface called syscall. Not all Linux systems have a man page syscall(2), though. Anyway, the statement mov eax,4 corresponds to the value of __NR_write in /usr/include/asm/unistd.h.
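
The correspondence can be checked with a small filter over the header. The sketch below runs on an inline excerpt in the style of the i386 header rather than on the file itself; syscall_number is a name chosen here.

```shell
#!/bin/sh
# Print the number that a header on stdin assigns to system call $1.
syscall_number() {
	awk -v name="__NR_$1" '$1 == "#define" && $2 == name { print $3; exit }'
}

# Excerpt in the style of asm/unistd.h for i386.
nr_write=$(printf '%s\n' \
	'#define __NR_read 3' \
	'#define __NR_write 4' \
	'#define __NR_open 5' \
	| syscall_number write)
echo "${nr_write}"
```

On the system described here the call would be "syscall_number write < /usr/include/asm/unistd.h". On other systems the definitions may live in a file that this header merely includes, in which case that file has to be fed in instead.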