It looks worse than you can imagine! I can imagine some pretty bad things! That's why I said *worse*!
Terry Pratchett, Moving Pictures
The fancy output format of "The address of main" was chosen for a reason: it is valid input for /bin/sh. Let's check whether the program from "The magic of the Elf" has main at the same offset.
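What "valid input for /bin/sh" buys us can be sketched in a few lines. The temporary file below is only a stand-in for the real output file. Its single line is an assumption modelled on the variable addr_main_x used by objdump.sh and on the address of main shown in this chapter. The real file may well contain more.

```shell
#!/bin/bash
# Hypothetical stand-in for out/.../addr_of_main. Only the variable name
# addr_main_x is known from objdump.sh; the file content is assumed.
addr_file=$(mktemp)
echo 'addr_main_x=1073c' > "${addr_file}"

# Reading the file with "." defines the variable in the current shell.
. "${addr_file}"
start_option="--start-address=0x${addr_main_x}"
echo "${start_option}"
rm -f "${addr_file}"
```

No parsing is required. The shell itself does the work of turning a line of output into a variable.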
The disassembly filters below are based on TEVWH_ASM_RETURN, a regular expression in perl syntax. Unfortunately plain sed on FreeBSD 4.7 has no "\|" or anything equivalent while option -E switches to modern regular expressions with incompatible syntax. The matter is further complicated by branch delay slots. On Sparc the instruction following a ret is executed while the jump is under way. A typical instruction to put there is restore. But triggering on that is not a clean solution.
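One possible way around the delay slot is to stop one line after the return, whatever instruction sits there. The following is a minimal sketch, not the TEVWH_ASM_RETURN expression itself. The function name cut_after_return is made up, and the filter assumes that only ret and retl count as returns.

```shell
#!/bin/bash
# Print a disassembly up to and including the delay slot of the first
# return. Assumption: only "ret" and "retl" are treated as returns.
cut_after_return() {
  awk '
    { print }
    stop { exit }                                  # delay slot was just printed
    /(^|[ \t])(ret|retl)([ \t]|$)/ { stop = 1 }    # next line is the delay slot
  '
}

# Sample input: the tail of main plus one line that must not be printed.
listing=$(cut_after_return <<'EOF'
10750: 94 10 20 03 mov 3, %o2
10754: 81 c7 e0 08 ret
10758: 91 e8 20 00 restore %g0, 0, %o0
1075c: 01 00 00 00 nop
EOF
)
echo "${listing}"
```

The filter does not care whether the delay slot holds restore, nop or anything else. It still fails on functions with more than one return, so it is no clean solution either.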
The output of objdump includes function labels. Filtering the complete disassembly can yield the desired code without prior knowledge of the function address. But since we already have the value we use --start-address for symmetry with ndisasm. That option accepts only numeric values, not symbol names.
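If only the symbol name is at hand, the numeric value can be looked up in a symbol listing first. A minimal sketch, assuming input in the three-column format of nm -p. The helper name addr_of_symbol is made up. The sample lines use the values of the statically linked binary examined later in this chapter; the type letter T for main is an assumption.

```shell
#!/bin/bash
# Turn a symbol name into the numeric value that --start-address expects.
# Reads "nm -p" style lines (value, type, name) on stdin.
addr_of_symbol() {
  awk -v name="$1" '$3 == name { sub(/^0+/, "", $1); print $1; exit }'
}

# Sample lines in nm format. The values are taken from the static binary,
# the type letter of main is assumed.
addr=$(printf '%s\n' \
  '00000000000184e8 W write' \
  '00000000000101ec T main' | addr_of_symbol main)
echo "--start-address=0x${addr}"
```

On a real system the input would come from /usr/bin/nm -p instead of printf.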
Command: pre/sparc-debian2.2-linux/magic_elf/objdump.sh
#!/bin/bash
. out/sparc-debian2.2-linux/magic_elf/addr_of_main
/usr/bin/objdump -d --start-address=0x${addr_main_x} \
tmp/sparc-debian2.2-linux/magic_elf/magic_elf \
| pre/sparc-debian2.2-linux/magic_elf/objdump_format.pl -start_address=${addr_main_x}
Command: pre/sparc-debian2.2-linux/magic_elf/objdump_format.pl
#!/usr/bin/perl -sw
# Perl 5.005_03 (part of FreeBSD 4.7) does not have [:xdigit:]
$::start_address='[0-9a-fA-F]+' if (!defined($::start_address));
# skip to start address
my $pattern = '^\s*' . $::start_address . ':';
while (<>) { last if m/$pattern/; }
for(;;)
{
s/\s+$//;
my $comment = s/\s+(!)\s*(.*)// ? "$1 $2" : '';
my ( $addr, $hexdump, $asm ) = split(/ *\t/);
my $line = sprintf("%-11s %-19s ", $addr, $hexdump);
$asm = sprintf('%-7s %s', $1, $2) if ($asm =~ m/^(\S+)\s+(.*)/);
$line = sprintf("%-11s %-19s %s", $addr, $hexdump, $asm);
$line = sprintf("%-59s %s", $line, $comment) if (length($comment) > 0);
print $line . "\n";
last if ($asm =~ m/\b(restore|unimp)\b/);
last if (!($_ = <>));
}
Output: out/sparc-debian2.2-linux/magic_elf/objdump.asm
1073c: 9d e3 bf 98 save %sp, -104, %sp
10740: 90 10 20 01 mov 1, %o0
10744: 13 00 00 40 sethi %hi(0x10000), %o1
10748: 92 12 60 01 or %o1, 1, %o1 ! 10001 <*ABS*+0x10001>
1074c: 40 00 44 5a call 218b4 <_PROCEDURE_LINKAGE_TABLE_+0x30>
10750: 94 10 20 03 mov 3, %o2
10754: 81 c7 e0 08 ret
10758: 91 e8 20 00 restore %g0, 0, %o0
This looks like a real main. So both programs indeed have main at the same offset. Unfortunately a brief look through /bin proves this to be pure chance. And instead of a real system call for write(2) we see something strange. It resolves to a location in a shared library. But what function in what library?
Command: pre/sparc-debian2.2-linux/magic_elf/gdb_core.sh
#!/bin/bash
/usr/bin/gdb ${1} -q <<EOF 2>&1
disassemble ${2}
EOF
Command: pre/sparc-debian2.2-linux/magic_elf/gdb_format.pl
#!/usr/bin/perl -nw
if (m/([^:]+):\s+(\S+)\s+(.*)/)
{
printf "%-26s%-13s ", $1 . ':', $2;
my $opcode = $2;
my $rest = $3;
if ($rest =~ s/\s+!\s*(.*)//)
{ printf "%-20s! %s\n", $rest, $1; }
else
{ print $rest . "\n"; }
exit(0) if ($opcode =~ m/(restore|unimp)/);
}
Command: pre/sparc-debian2.2-linux/magic_elf/gdb.sh
#!/bin/bash
file=${1:-tmp/sparc-debian2.2-linux/magic_elf/magic_elf}
func=${2:-main}
/bin/echo "[func=${func}]"
pre/sparc-debian2.2-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/sparc-debian2.2-linux/magic_elf/gdb_format.pl
Output: out/sparc-debian2.2-linux/magic_elf/gdb
[func=main]
0x1073c <main>: save %sp, -104, %sp
0x10740 <main+4>: mov 1, %o0
0x10744 <main+8>: sethi %hi(0x10000), %o1
0x10748 <main+12>: or %o1, 1, %o1 ! 0x10001
0x1074c <main+16>: call 0x218b4 <write>
0x10750 <main+20>: mov 3, %o2
0x10754 <main+24>: ret
0x10758 <main+28>: restore %g0, 0, %o0
Looks better. We need a way to retrieve the function name, write, from this output. Then we can feed gdb this argument for disassembly.
Command: pre/sparc-debian2.2-linux/evil_magic/first_gdb_func.sed
#!/bin/sed -nf
/.*<\(.*\)>$/ {
s//\1/
p
q
}
Command: pre/sparc-debian2.2-linux/evil_magic/gdb_write.sh
#!/bin/bash
file=${1:-tmp/sparc-debian2.2-linux/magic_elf/magic_elf}
func=$( pre/sparc-debian2.2-linux/evil_magic/first_gdb_func.sed \
< out/sparc-debian2.2-linux/magic_elf/gdb )
/bin/echo "[func=${func}]"
pre/sparc-debian2.2-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/sparc-debian2.2-linux/magic_elf/gdb_format.pl
Output: out/sparc-debian2.2-linux/evil_magic/write.gdb
[func=write]
0x218b4 <write>: sethi %hi(0xc000), %g1
0x218b8 <write+4>: b,a 0x21884 <_PROCEDURE_LINKAGE_TABLE_>
0x218bc <write+8>: nop
Oops. Shared libraries don't share their secrets with everyone.
We can now search for a fine manual explaining how to debug shared libraries. Or just compile the bugger statically.
Command: pre/sparc-debian2.2-linux/magic_elf/cc_static.sh
#!/bin/bash
/usr/bin/gcc -Wall -O1 -I . -I out/sparc-debian2.2-linux -D NDEBUG -static \
-o tmp/sparc-debian2.2-linux/magic_elf/magic_elf_static \
pre/sparc-debian2.2-linux/magic_elf/magic_elf.c \
&& /bin/ls -l tmp/sparc-debian2.2-linux/magic_elf \
&& tmp/sparc-debian2.2-linux/magic_elf/magic_elf_static
Output: out/sparc-debian2.2-linux/magic_elf/magic_elf_static
total 304
-rwxr-xr-x 1 alba alba 11252 Jan 8 23:09 magic_elf
-rwxr-xr-x 1 alba alba 291434 Jan 8 23:09 magic_elf_static
ELF
Seems we found an easy way to fill up the hard disk. Anyway, what has gdb(1) to say about it?
Output: out/sparc-debian2.2-linux/evil_magic/static_main.gdb
[func=main]
0x101ec <main>: save %sp, -104, %sp
0x101f0 <main+4>: mov 1, %o0
0x101f4 <main+8>: sethi %hi(0x10000), %o1
0x101f8 <main+12>: or %o1, 1, %o1 ! 0x10001
0x101fc <main+16>: call 0x184e8 <write>
0x10200 <main+20>: mov 3, %o2
0x10204 <main+24>: ret
0x10208 <main+28>: restore %g0, 0, %o0
The function was called write before, it is called write now. Let's look at what is behind the name.
Source: pre/sparc-debian2.2-linux/evil_magic/static_write.sh
#!/bin/bash
file=${1:-tmp/sparc-debian2.2-linux/magic_elf/magic_elf_static}
func=$( pre/sparc-debian2.2-linux/evil_magic/first_gdb_func.sed \
< out/sparc-debian2.2-linux/evil_magic/static_main.gdb )
/bin/echo "[func=${func}]"
pre/sparc-debian2.2-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/sparc-debian2.2-linux/magic_elf/gdb_format.pl
Output: out/sparc-debian2.2-linux/evil_magic/static_write.gdb
[func=write]
0x184e8 <write>: mov 4, %g1 ! 0x4
0x184ec <write+4>: ta 0x10
0x184f0 <write+8>: bcc,a 0x18514 <write+44>
0x184f4 <write+12>: nop
0x184f8 <write+16>: save %sp, -96, %sp
0x184fc <write+20>: call 0x113d8 <__errno_location>
0x18500 <write+24>: nop
0x18504 <write+28>: st %i0, [ %o0 ]
0x18508 <write+32>: restore
The disassembly above is not guaranteed to work. The names of symbols imported from libraries differ from one platform to another, and from one compiler to another. A more rational approach is to search the listing of all symbols for similar names and identical addresses.
Command: pre/sparc-debian2.2-linux/evil_magic/nm.sh
#!/bin/bash
# -p produces same output format on SunOS and GNU
/usr/bin/nm -p tmp/sparc-debian2.2-linux/magic_elf/magic_elf_static \
| /bin/grep '[^[:alnum:]]write\>' \
| /usr/bin/sort
Output: out/sparc-debian2.2-linux/evil_magic/nm
0000000000012624 t _IO_unbuffer_write
00000000000129a8 T _IO_default_write
00000000000184e8 T __libc_write
00000000000184e8 W __write
00000000000184e8 W write
0000000000019e44 T _IO_do_write
0000000000019e44 T _IO_new_do_write
0000000000019e78 t new_do_write
000000000001a81c T _IO_file_write
000000000001a81c T _IO_new_file_write
I suspect there is actually order behind the chaos. The symbol __write, with a varying number of leading underscores, seems to be "the real thing" on all platforms. The aliases for the value, 0x184e8, differ a lot.
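Finding the aliases by eye works for ten lines. For longer listings the search for identical addresses can be automated. A minimal sketch, assuming the three-column format of nm -p; the helper name aliases_of is made up.

```shell
#!/bin/bash
# List every symbol that shares its value with a given symbol.
# Reads "nm -p" style lines (value, type, name) on stdin.
aliases_of() {
  awk -v name="$1" '
    { value[$3] = $1; names[$1] = names[$1] " " $3 }
    END { if (name in value) print substr(names[value[name]], 2) }
  '
}

# A few lines from the nm listing above.
aliases=$(aliases_of write <<'EOF'
00000000000129a8 T _IO_default_write
00000000000184e8 T __libc_write
00000000000184e8 W __write
00000000000184e8 W write
0000000000019e44 T _IO_do_write
EOF
)
echo "${aliases}"
```

For the sample input this prints the three names that share the value 0x184e8, in the order they appear in the listing.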
Command: pre/sparc-debian2.2-linux/evil_magic/gdb_nm.sh
#!/bin/bash
file=${1:-tmp/sparc-debian2.2-linux/magic_elf/magic_elf_static}
# \< and \> don't work on i386-freebsd4.7
func=$( /usr/bin/nm -p ${file} \
| /bin/sed -ne '/.*[tTwW] \(__*write\)/ {
s//\1/
p
q
}' )
/bin/echo "[func=${func}]"
pre/sparc-debian2.2-linux/magic_elf/gdb_core.sh ${file} ${func} \
| pre/sparc-debian2.2-linux/magic_elf/gdb_format.pl
Output: out/sparc-debian2.2-linux/evil_magic/gdb_nm
[func=__write]
0x184e8 <write>: mov 4, %g1 ! 0x4
0x184ec <write+4>: ta 0x10
0x184f0 <write+8>: bcc,a 0x18514 <write+44>
0x184f4 <write+12>: nop
0x184f8 <write+16>: save %sp, -96, %sp
0x184fc <write+20>: call 0x113d8 <__errno_location>
0x18500 <write+24>: nop
0x18504 <write+28>: st %i0, [ %o0 ]
0x18508 <write+32>: restore
There are two man pages giving some overview of system calls, intro(2) and syscalls(2). /usr/include/unistd.h declares a traditional general purpose interface called syscall. Not all Linux systems have a man page for syscall(2), though. Anyway, the instruction mov 4, %g1 corresponds to the value of __NR_write in /usr/include/asm/unistd.h. Note that Linux uses ta 0x10 while Solaris uses ta 8. There are other differences, but they are beyond the scope of this document.
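The correspondence between the constant in the disassembly and the header can be checked mechanically. A minimal sketch, assuming the header uses plain "#define __NR_name value" lines. The helper name syscall_number is made up, and the here-document is sample input rather than a copy of the real header. Only the value 4 for write is taken from the text above.

```shell
#!/bin/bash
# Look up a system call number in "#define __NR_name value" lines.
syscall_number() {
  awk -v macro="__NR_$1" '$1 == "#define" && $2 == macro { print $3; exit }'
}

# Sample header lines. On a real system the input would be
# /usr/include/asm/unistd.h or a file included from there.
nr_write=$(syscall_number write <<'EOF'
#define __NR_exit 1
#define __NR_read 3
#define __NR_write 4
EOF
)
echo "mov ${nr_write}, %g1"
```

The numbers are specific to the architecture, so the header must belong to the same platform as the disassembled binary.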