3. Magic revealed

 

It looks worse than you can imagine!

I can imagine some pretty bad things!

That's why I said *worse*!

 Terry Pratchett, Moving Pictures

The fancy output format in "The address of main" was chosen for a reason: it is valid input for /bin/sh. Let's check whether the program from "The magic of the Elf" has main at the same offset.
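As a minimal sketch of what "valid input for /bin/sh" buys us: a file of plain name=value lines can be sourced, after which the values are ordinary shell variables. The temporary file and both values below are assumptions for illustration. The address matches the listings in this chapter, and the offset assumes the common i386 load address 0x08048000. The real file is out/i386-redhat8.0-linux/magic_elf/addr_of_main.

```shell
#!/bin/bash
# Hypothetical stand-in for the addr_of_main file; both values are assumed.
tmp=$(mktemp)
cat > "${tmp}" <<'EOF'
addr_main_x=08048328
ofs_main=808
EOF

# Sourcing the file turns each line into a shell variable.
. "${tmp}"
rm -f "${tmp}"

# Cross-check: file offset = address - assumed load address 0x08048000.
base=$(( 0x08048000 ))
calc=$(( 0x${addr_main_x} - base ))
echo "addr=0x${addr_main_x} ofs=${ofs_main} calc=${calc}"
```

If the assumed load address holds, the stored offset and the computed one agree.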

3.1. ndisasm

The disassembly filter is based on TEVWH_ASM_RETURN, a regular expression in perl syntax. See objdump_format.pl (i) for why sed is not usable.

Command: pre/i386-redhat8.0-linux/magic_elf/ndisasm.sh
#!/bin/bash
src=tmp/i386-redhat8.0-linux/magic_elf/magic_elf
[ -s ${src} ] || exit 1

. out/i386-redhat8.0-linux/magic_elf/addr_of_main
/usr/bin/ndisasm -e ${ofs_main} -o 0x${addr_main_x} -U ${src} \
| /usr/bin/perl \
	-ne 'print $_; exit if m/\b(ret|hlt)\b/;'

Output: out/i386-redhat8.0-linux/magic_elf/ndisasm.asm
08048328  55                push ebp
08048329  89E5              mov ebp,esp
0804832B  83EC08            sub esp,byte +0x8
0804832E  83E4F0            and esp,byte -0x10
08048331  83EC04            sub esp,byte +0x4
08048334  6A03              push byte +0x3
08048336  6801800408        push dword 0x8048001
0804833B  6A01              push byte +0x1
0804833D  E816FFFFFF        call 0x8048258
08048342  B800000000        mov eax,0x0
08048347  C9                leave
08048348  C3                ret

3.2. objdump -d

The output of objdump includes function labels, so filtering the complete disassembly can yield the desired code without prior knowledge of the function address. But since we already have the value, we use --start-address for symmetry with ndisasm. That option accepts only numeric values, not symbol names.
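For comparison, filtering by label could look roughly like the sketch below. It is not the filter used in this document. The sample text is hand-written in the general layout of objdump -d output, and the functions before and after main (including their addresses) are made up.

```shell
#!/bin/bash
# Abbreviated, hand-written sample in the layout of "objdump -d" output.
# The labels <before> and <after> and their addresses are hypothetical.
sample='08048300 <before>:
 8048300:   55                  push    ebp
 8048301:   c3                  ret

08048328 <main>:
 8048328:   55                  push    ebp
 8048329:   89 e5               mov     ebp,esp
 8048348:   c3                  ret

0804834c <after>:
 804834c:   90                  nop'

func=main
body=$(printf '%s\n' "${sample}" | awk -v label="<${func}>:" '
    $2 == label       { inside = 1; next }  # header line: "ADDRESS <name>:"
    inside && NF == 0 { exit }              # a blank line ends the function
    inside            { print }
')
printf '%s\n' "${body}"
```

The sketch relies on objdump separating functions with a blank line. It needs the symbol name but not the address.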

The filter to pretty up this disassembly is at objdump_format.pl (i).

Output: out/i386-redhat8.0-linux/magic_elf/objdump.asm
 8048328:   55                  push    ebp
 8048329:   89 e5               mov     ebp,esp
 804832b:   83 ec 08            sub     esp,0x8
 804832e:   83 e4 f0            and     esp,0xfffffff0
 8048331:   83 ec 04            sub     esp,0x4
 8048334:   6a 03               push    0x3
 8048336:   68 01 80 04 08      push    0x8048001
 804833b:   6a 01               push    0x1
 804833d:   e8 16 ff ff ff      call    8048258 <_init+0x28>
 8048342:   b8 00 00 00 00      mov     eax,0x0
 8048347:   c9                  leave
 8048348:   c3                  ret

This looks like a real main. So both programs indeed have main at the same offset. Unfortunately, a brief look through /bin proves this to be pure chance. And instead of a real system call for write(2) we see something strange: a call to 0x8048258, which objdump can only label as <_init+0x28>. It resolves to a location in a shared library. But what function in what library?
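The call target can be checked by hand from the raw bytes. The opcode e8 takes a 32-bit displacement relative to the address of the next instruction. The sketch below redoes that arithmetic with the bytes from the listing, assuming a shell with 64-bit arithmetic such as bash.

```shell
#!/bin/bash
# Bytes taken from the listing: "e8 16 ff ff ff" at address 0x804833d.
insn_addr=$(( 0x804833d ))
insn_len=5                  # one opcode byte (e8) plus four displacement bytes
disp=$(( 0xffffff16 ))      # little-endian bytes 16 ff ff ff, read as 32 bits

# Target = address of the next instruction + displacement, wrapped to 32 bits.
target=$(( (insn_addr + insn_len + disp) & 0xffffffff ))
printf 'call target: 0x%x\n' "${target}"

# objdump labels the target <_init+0x28>, so _init itself would start here:
init=$(( target - 0x28 ))
printf '_init:       0x%x\n' "${init}"
```

The result agrees with the 0x8048258 printed by both disassemblers.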

3.3. GDB to the rescue

The filter to pretty up this disassembly is at gdb_format.pl (i)

Command: pre/i386-redhat8.0-linux/magic_elf/gdb.sh
#!/bin/bash
file=${1:-tmp/i386-redhat8.0-linux/magic_elf/magic_elf}
func=${2:-main}

/bin/echo "[func=${func}]"
pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat8.0-linux/magic_elf/gdb_format.pl

Output: out/i386-redhat8.0-linux/magic_elf/gdb
[func=main]
0x8048328 <main>:             push          ebp
0x8048329 <main+1>:           mov           ebp,esp
0x804832b <main+3>:           sub           esp,0x8
0x804832e <main+6>:           and           esp,0xfffffff0
0x8048331 <main+9>:           sub           esp,0x4
0x8048334 <main+12>:          push          0x3
0x8048336 <main+14>:          push          0x8048001
0x804833b <main+19>:          push          0x1
0x804833d <main+21>:          call          0x8048258 <write>
0x8048342 <main+26>:          mov           eax,0x0
0x8048347 <main+31>:          leave         
0x8048348 <main+32>:          ret           

Looks better. We need a way to retrieve the function name, write, from this output. Then we can pass that name to gdb for disassembly.

Command: pre/i386-redhat8.0-linux/evil_magic/first_gdb_func.sed
#!/bin/sed -nf
/.*<\(.*\)>$/ {
	s//\1/
	p
	q
}
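
To see what the script picks up, here is a small sketch that runs the same sed commands on a few lines copied from the gdb output above. Passing the commands as separate -e fragments is only a convenience for this example.

```shell
#!/bin/bash
# The header line plus three lines copied from the gdb output of main.
sample='[func=main]
0x804833b <main+19>:          push          0x1
0x804833d <main+21>:          call          0x8048258 <write>
0x8048342 <main+26>:          mov           eax,0x0'

# Same commands as first_gdb_func.sed, given as separate -e fragments.
func=$(printf '%s\n' "${sample}" \
    | sed -n -e '/.*<\(.*\)>$/{' -e 's//\1/' -e 'p' -e 'q' -e '}')
echo "func=${func}"
```

Only lines that end in ">" match, so the address labels at the start of each line are skipped. Because the leading .* is greedy, the last <...> on the matching line wins.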

Command: pre/i386-redhat8.0-linux/evil_magic/gdb_write.sh
#!/bin/bash
file=${1:-tmp/i386-redhat8.0-linux/magic_elf/magic_elf}
func=$( pre/i386-redhat8.0-linux/evil_magic/first_gdb_func.sed \
	< out/i386-redhat8.0-linux/magic_elf/gdb )

/bin/echo "[func=${func}]"
pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat8.0-linux/magic_elf/gdb_format.pl

Output: out/i386-redhat8.0-linux/evil_magic/write.gdb
[func=write]
0x8048258 <write>:            jmp           ds:0x804948c
0x804825e <write+6>:          push          0x0
0x8048263 <write+11>:         jmp           0x8048248 <_init+24>

Oops. Shared libraries don't share their secrets with everyone.

3.4. In doubt use force

We can now search for a fine manual explaining how to debug shared libraries. Or just compile the bugger statically.

Seems we have found an easy way to fill up the hard disk. Anyway, what does gdb(1) have to say about it?

The function was called write before, it is called write now. Let's look at what is behind the name.

3.5. Write your name

The disassembly above is not guaranteed to work. The names of symbols imported by libraries differ from one platform to another, and from one compiler to another. A more rational approach is to search the listing of all symbols for similar names and identical addresses.

I suspect there is actually order behind the chaos. The symbol __write, with a varying number of leading underscores, seems to be "the real thing" on all platforms. The aliases for the value, 0x804d180, differ a lot.
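
The search for identical addresses could be sketched as follows. The listing is a made-up sample in nm-style layout (value, type, name). Only the value 0x804d180 and the name __write come from the text above. Every other name and value is hypothetical.

```shell
#!/bin/bash
# Hypothetical symbol listing in nm-style layout (value, type, name).
listing='0804d180 W write
0804d180 T __libc_write
0804d180 W __write
0804d1c0 T __libc_read
08048328 T main'

name=write
aliases=$(printf '%s\n' "${listing}" | awk -v name="${name}" '
    { value[$3] = $1; line[NR] = $0 }            # remember every symbol
    END {
        for (i = 1; i <= NR; i++) {
            split(line[i], f, " ")
            if (f[1] == value[name]) print f[3]  # same value: an alias
        }
    }
')
printf '%s\n' "${aliases}"
```

Matching on the value instead of the name is what keeps this independent of how many underscores a platform prefers.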

There are two man pages giving some overview of system calls, intro(2) and syscalls(2). /usr/include/unistd.h declares a traditional general-purpose interface called syscall. Not all Linux systems have a man page for syscall(2), though. Anyway, the instruction mov eax,4 corresponds to the value of __NR_write in /usr/include/asm/unistd.h.