It looks worse than you can imagine! I can imagine some pretty bad things! That's why I said *worse*!
Terry Pratchett, Moving Pictures
The fancy output format in "The address of main" was chosen for a reason: it is valid input for /bin/sh. Let's check whether the program from "The magic of the Elf" has main at the same offset.
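To make "valid input for /bin/sh" concrete, here is a minimal sketch of reading such a file with the dot command. The file contents are a hypothetical stand-in: the variable names are the ones used by the scripts in this section, the address matches the listings, and the file offset is only an assumed value for illustration.

```shell
#!/bin/bash
# Hypothetical stand-in for out/.../addr_of_main. The real file is
# generated by an earlier step; these two lines are assumptions.
tmpfile=$(mktemp)
cat > "${tmpfile}" <<'EOF'
addr_main_x=08048328
ofs_main=808
EOF

# The dot command runs the file in the current shell, so the plain
# assignments become ordinary shell variables.
. "${tmpfile}"
rm -f "${tmpfile}"

echo "main: address 0x${addr_main_x}, file offset ${ofs_main}"
```

No parsing code is needed on the consumer side, which is why the scripts here can simply source the file and use ${addr_main_x} and ${ofs_main}.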
The disassembly filter is based on TEVWH_ASM_RETURN, a regular expression in perl syntax. See objdump_format.pl (i) for why sed is not usable.
Command: pre/i386-redhat8.0-linux/magic_elf/ndisasm.sh
#!/bin/bash
src=tmp/i386-redhat8.0-linux/magic_elf/magic_elf
[ -s ${src} ] || exit 1
. out/i386-redhat8.0-linux/magic_elf/addr_of_main
/usr/bin/ndisasm -e ${ofs_main} -o 0x${addr_main_x} -U ${src} \
| /usr/bin/perl \
-ne 'print $_; exit if m/\b(ret|hlt)\b/;'
Output: out/i386-redhat8.0-linux/magic_elf/ndisasm.asm
08048328 55 push ebp
08048329 89E5 mov ebp,esp
0804832B 83EC08 sub esp,byte +0x8
0804832E 83E4F0 and esp,byte -0x10
08048331 83EC04 sub esp,byte +0x4
08048334 6A03 push byte +0x3
08048336 6801800408 push dword 0x8048001
0804833B 6A01 push byte +0x1
0804833D E816FFFFFF call 0x8048258
08048342 B800000000 mov eax,0x0
08048347 C9 leave
08048348 C3 ret
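The filter's job is simple: pass lines through until the first one that contains ret or hlt as a whole word, then stop. The sketch below does the same with awk instead of perl. It is not the author's filter, the function name until_return is made up, and the trailing nop line in the sample input is invented to show where the output stops.

```shell
#!/bin/bash
# Print stdin up to and including the first line that contains
# "ret" or "hlt" as a whole word. The explicit character classes
# stand in for perl's \b word boundary.
until_return() {
    awk '{ print } /(^|[^A-Za-z0-9_])(ret|hlt)($|[^A-Za-z0-9_])/ { exit }'
}

until_return <<'EOF'
08048342 B800000000 mov eax,0x0
08048347 C9 leave
08048348 C3 ret
08048349 90 nop
EOF
```

The first three lines are printed and the nop line is dropped.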
The output of objdump includes function labels, so filtering the complete disassembly can yield the desired code without prior knowledge of the function address. But since we already have the value, we use --start-address for symmetry with ndisasm. That option accepts only numeric values, not symbol names.
Command: pre/i386-redhat8.0-linux/magic_elf/objdump.sh
#!/bin/bash
. out/i386-redhat8.0-linux/magic_elf/addr_of_main
/usr/bin/objdump -d -Mintel \
--start-address=0x${addr_main_x} \
tmp/i386-redhat8.0-linux/magic_elf/magic_elf \
2>&1 | pre/i386-redhat8.0-linux/magic_elf/objdump_format.pl \
-start_address=${addr_main_x}
The filter to pretty up this disassembly is at objdump_format.pl (i).
Output: out/i386-redhat8.0-linux/magic_elf/objdump.asm
8048328: 55 push ebp
8048329: 89 e5 mov ebp,esp
804832b: 83 ec 08 sub esp,0x8
804832e: 83 e4 f0 and esp,0xfffffff0
8048331: 83 ec 04 sub esp,0x4
8048334: 6a 03 push 0x3
8048336: 68 01 80 04 08 push 0x8048001
804833b: 6a 01 push 0x1
804833d: e8 16 ff ff ff call 8048258 <_init+0x28>
8048342: b8 00 00 00 00 mov eax,0x0
8048347: c9 leave
8048348: c3 ret
This looks like a real main. So both programs indeed have main at the same offset. Unfortunately a brief look through /bin proves this to be pure chance. And instead of a real system call for write(2) we see something strange. It resolves to a location in a shared library. But what function in what library?
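As an aside, the call target printed in both listings can be checked by hand. The opcode E8 takes a signed 32-bit displacement, stored little-endian, that is relative to the address of the next instruction. A small sketch of that arithmetic follows; the function name call_target is made up.

```shell
#!/bin/bash
# call_target ADDRESS REL32
# ADDRESS is where the 5-byte call instruction (E8 + rel32) starts.
# REL32 is the displacement, already converted from little-endian.
call_target() {
    addr=$(( $1 ))
    rel=$(( $2 ))
    # Interpret the displacement as a signed 32-bit value.
    if [ "${rel}" -ge 2147483648 ]; then
        rel=$(( rel - 4294967296 ))
    fi
    printf '0x%x\n' "$(( addr + 5 + rel ))"
}

# Bytes E8 16 FF FF FF at 0x0804833D, taken from the listings.
call_target 0x0804833D 0xFFFFFF16
```

This prints 0x8048258, the address both disassemblers show.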
Command: pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh
#!/bin/bash
/usr/bin/gdb ${1} -q <<EOF 2>&1
set disassembly-flavor intel
disassemble ${2}
EOF
The filter to pretty up this disassembly is at gdb_format.pl (i).
Command: pre/i386-redhat8.0-linux/magic_elf/gdb.sh
#!/bin/bash
file=${1:-tmp/i386-redhat8.0-linux/magic_elf/magic_elf}
func=${2:-main}
/bin/echo "[func=${func}]"
pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat8.0-linux/magic_elf/gdb_format.pl
Output: out/i386-redhat8.0-linux/magic_elf/gdb
[func=main]
0x8048328 <main>: push ebp
0x8048329 <main+1>: mov ebp,esp
0x804832b <main+3>: sub esp,0x8
0x804832e <main+6>: and esp,0xfffffff0
0x8048331 <main+9>: sub esp,0x4
0x8048334 <main+12>: push 0x3
0x8048336 <main+14>: push 0x8048001
0x804833b <main+19>: push 0x1
0x804833d <main+21>: call 0x8048258 <write>
0x8048342 <main+26>: mov eax,0x0
0x8048347 <main+31>: leave
0x8048348 <main+32>: ret
Looks better. We need a way to retrieve the function name, write, from this output. Then we can feed gdb this argument for disassembly.
Command: pre/i386-redhat8.0-linux/evil_magic/first_gdb_func.sed
#!/bin/sed -nf
/.*<\(.*\)>$/ {
s//\1/
p
q
}
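To see what this sed script does, here is the same expression wrapped in a shell function (the name first_func is made up) and fed three lines taken from the gdb output above. Only a line that ends in ">" matches, so the first call instruction with a symbolic target wins.

```shell
#!/bin/bash
# Same expression as first_gdb_func.sed, inlined for a quick test.
first_func() {
    sed -n '/.*<\(.*\)>$/ {
s//\1/
p
q
}'
}

first_func <<'EOF'
0x804833b <main+19>: push 0x1
0x804833d <main+21>: call 0x8048258 <write>
0x8048342 <main+26>: mov eax,0x0
EOF
```

This prints write. The first .* is greedy, so the capture is the last <...> on the line, not <main+21>.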
Command: pre/i386-redhat8.0-linux/evil_magic/gdb_write.sh
#!/bin/bash
file=${1:-tmp/i386-redhat8.0-linux/magic_elf/magic_elf}
func=$( pre/i386-redhat8.0-linux/evil_magic/first_gdb_func.sed \
< out/i386-redhat8.0-linux/magic_elf/gdb )
/bin/echo "[func=${func}]"
pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat8.0-linux/magic_elf/gdb_format.pl
Output: out/i386-redhat8.0-linux/evil_magic/write.gdb
[func=write]
0x8048258 <write>: jmp ds:0x804948c
0x804825e <write+6>: push 0x0
0x8048263 <write+11>: jmp 0x8048248 <_init+24>
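Oops. Shared libraries don't share their secrets with everyone. What gdb shows here is only a stub inside the executable: an indirect jump through a pointer at 0x804948c that the dynamic linker fills in at run time. The code of write itself lives in the shared library.
We can now search for a fine manual explaining how to debug shared libraries. Or just compile the bugger statically.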
Command: pre/i386-redhat8.0-linux/magic_elf/cc_static.sh
#!/bin/bash
/usr/bin/gcc -static \
-Wall -O1 -I . -I out/i386-redhat8.0-linux -D NDEBUG \
-o tmp/i386-redhat8.0-linux/magic_elf/magic_elf_static \
pre/i386-redhat8.0-linux/magic_elf/magic_elf.c \
&& /bin/ls -l tmp/i386-redhat8.0-linux/magic_elf/ \
&& tmp/i386-redhat8.0-linux/magic_elf/magic_elf_static
Output: out/i386-redhat8.0-linux/magic_elf/magic_elf_static
total 456
-rwxrwxr-x 1 alba alba 11349 Feb 15 23:48 magic_elf
-rwxrwxr-x 1 alba alba 447003 Feb 15 23:48 magic_elf_static
ELF
Seems we found an easy way to fill up the hard disk. Anyway, what does gdb(1) have to say about it?
Output: out/i386-redhat8.0-linux/evil_magic/static_main.gdb
[func=main]
0x80481d0 <main>: push ebp
0x80481d1 <main+1>: mov ebp,esp
0x80481d3 <main+3>: sub esp,0x8
0x80481d6 <main+6>: and esp,0xfffffff0
0x80481d9 <main+9>: sub esp,0x4
0x80481dc <main+12>: push 0x3
0x80481de <main+14>: push 0x8048001
0x80481e3 <main+19>: push 0x1
0x80481e5 <main+21>: call 0x804d180 <write>
0x80481ea <main+26>: mov eax,0x0
0x80481ef <main+31>: leave
0x80481f0 <main+32>: ret
The function was called write before, it is called write now. Let's look at what is behind the name.
Source: pre/i386-redhat8.0-linux/evil_magic/static_write.sh
#!/bin/bash
file=${1:-tmp/i386-redhat8.0-linux/magic_elf/magic_elf_static}
func=$( pre/i386-redhat8.0-linux/evil_magic/first_gdb_func.sed \
< out/i386-redhat8.0-linux/evil_magic/static_main.gdb )
/bin/echo "[func=${func}]"
pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat8.0-linux/magic_elf/gdb_format.pl
Output: out/i386-redhat8.0-linux/evil_magic/static_write.gdb
[func=write]
0x804d180 <write>: push ebx
0x804d181 <write+1>: mov edx,DWORD PTR [esp+16]
0x804d185 <write+5>: mov ecx,DWORD PTR [esp+12]
0x804d189 <write+9>: mov ebx,DWORD PTR [esp+8]
0x804d18d <write+13>: mov eax,0x4
0x804d192 <write+18>: int 0x80
0x804d194 <write+20>: pop ebx
0x804d195 <write+21>: cmp eax,0xfffff001
0x804d19a <write+26>: jae 0x804d8e0 <__syscall_error>
0x804d1a0 <write+32>: ret
The above disassembly is not guaranteed to work. The names of symbols imported from libraries differ from one platform to another, and from one compiler to another. A more rational approach is to search the listing of all symbols for similar names and identical addresses.
Command: pre/i386-redhat8.0-linux/evil_magic/nm.sh
#!/bin/bash
# -p produces same output format on SunOS and GNU
/usr/bin/nm -p tmp/i386-redhat8.0-linux/magic_elf/magic_elf_static \
| /bin/grep '[^[:alnum:]]write\>' \
| /bin/sort
Output: out/i386-redhat8.0-linux/evil_magic/nm
0804d180 T __libc_write
0804d180 W write
0804d180 W __write
0804fe0c T _IO_default_write
0805edfc T _IO_wdo_write
08061684 T _IO_new_do_write
08061684 W _IO_do_write
0806197c T _IO_new_file_write
0806197c W _IO_file_write
I suspect there is actually order behind the chaos. The symbol __write, with a varying number of leading underscores, seems to be "the real thing" on all platforms. The aliases for the value, 0x804d180, differ a lot.
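The "identical addresses" half of that search is easy to script. The sketch below lists every name that nm reports for one address; the function name aliases_of is made up, and the sample input is the first four lines of the listing above.

```shell
#!/bin/bash
# aliases_of ADDRESS
# Reads "address type name" lines (as printed by nm -p) from stdin
# and prints every name whose address equals ADDRESS.
aliases_of() {
    # Appending "" forces a string comparison, so awk never treats
    # something like 0804e180 as a number in exponent notation.
    awk -v addr="$1" '($1 "") == (addr "") { print $3 }'
}

aliases_of 0804d180 <<'EOF'
0804d180 T __libc_write
0804d180 W write
0804d180 W __write
0804fe0c T _IO_default_write
EOF
```

This prints __libc_write, write and __write, one per line. On another platform the same address may come back with a different set of names.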
Command: pre/i386-redhat8.0-linux/evil_magic/gdb_nm.sh
#!/bin/bash
file=${1:-tmp/i386-redhat8.0-linux/magic_elf/magic_elf_static}
# \< and \> don't work on i386-freebsd4.7
func=$( /usr/bin/nm -p ${file} \
| /bin/sed -ne '/.*[tTwW] \(__*write\)/ {
s//\1/
p
q
}' )
/bin/echo "[func=${func}]"
pre/i386-redhat8.0-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/i386-redhat8.0-linux/magic_elf/gdb_format.pl
Output: out/i386-redhat8.0-linux/evil_magic/gdb_nm
[func=__write]
0x804d180 <write>: push ebx
0x804d181 <write+1>: mov edx,DWORD PTR [esp+16]
0x804d185 <write+5>: mov ecx,DWORD PTR [esp+12]
0x804d189 <write+9>: mov ebx,DWORD PTR [esp+8]
0x804d18d <write+13>: mov eax,0x4
0x804d192 <write+18>: int 0x80
0x804d194 <write+20>: pop ebx
0x804d195 <write+21>: cmp eax,0xfffff001
0x804d19a <write+26>: jae 0x804d8e0 <__syscall_error>
0x804d1a0 <write+32>: ret
There are two man pages giving some overview of system calls, intro(2) and syscalls(2). /usr/include/unistd.h declares a traditional general purpose interface called syscall. Not all Linux systems have a man page for syscall(2), though. Anyway, the statement mov eax,0x4 corresponds to the value of __NR_write in /usr/include/asm/unistd.h.
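The lookup from number to name can be sketched as a small filter over "#define __NR_..." lines. The function name syscall_name is made up. The sample input imitates the first few i386 entries instead of reading the real header, whose location and layout vary between distributions.

```shell
#!/bin/bash
# syscall_name NUMBER
# Reads "#define __NR_name number" lines from stdin and prints the
# name that belongs to NUMBER (decimal or 0x-prefixed hex).
syscall_name() {
    awk -v nr="$(( $1 ))" '
        $1 == "#define" && $2 ~ /^__NR_/ && $3 == nr {
            sub(/^__NR_/, "", $2)
            print $2
        }'
}

# Value loaded into eax before int 0x80 in the disassembly of write.
syscall_name 0x4 <<'EOF'
#define __NR_exit 1
#define __NR_fork 2
#define __NR_read 3
#define __NR_write 4
#define __NR_open 5
EOF
```

This prints write. The three mov instructions before int 0x80 load the arguments of write(2) into ebx, ecx and edx, which is the i386 Linux convention for passing system call parameters.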