2. The language of evil

"Evil does seek to maintain power by suppressing the truth."

"Or by misleading the innocent."

  Spock and McCoy, "And The Children Shall Lead", stardate 5029.5.

The fancy output format of "The address of main" was chosen for a reason: it is valid input for /bin/sh. Let's see whether the program from "In the language of mortals" has main at the same offset.
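
What "valid input for /bin/sh" buys us can be sketched in a few lines. The file written here is a hypothetical stand-in for addr_of_main, whose real content is not reproduced in this section. The value of main matches the disassembly below; ofs=1024 is an assumption derived from a base address of 0x08048000.

```shell
#!/bin/sh
# Hypothetical stand-in for ${OUT}/magic_elf/addr_of_main.
# Both values are assumptions made for this illustration.
addr_file=$(mktemp)
cat > "${addr_file}" <<'EOF'
main=0x08048400
ofs=1024
EOF

# Sourcing the file with "." turns every line into a shell variable.
. "${addr_file}"
echo "main=${main} ofs=${ofs}"		# prints: main=0x08048400 ofs=1024
rm -f "${addr_file}"
```

A script that sources such a file needs no parsing code of its own, which is all the scripts below rely on.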

Command: src/magic_elf/intel.sh
#!/bin/sh
. ${OUT}/magic_elf/addr_of_main
ndisasm -e ${ofs} -o ${main} -U ${TMP}/magic_elf/magic_elf \
| sed -e '/ret/q'

Output: out/i386-redhat-linux/magic_elf/disasm
08048400  55                push ebp
08048401  89E5              mov ebp,esp
08048403  83EC0C            sub esp,byte +0xc
08048406  6A03              push byte +0x3
08048408  6801800408        push dword 0x8048001
0804840D  6A01              push byte +0x1
0804840F  E8BCFEFFFF        call 0x80482d0
08048414  31C0              xor eax,eax
08048416  89EC              mov esp,ebp
08048418  5D                pop ebp
08048419  C3                ret

While the simplicity of ndisasm is charming, the glitches in objdump's output require heavy machinery. Note that the character ";" starts a comment on i386; on sparc a "!" is used instead.

Command: src/magic_elf/objdump.sh
#!/bin/sh
. ${OUT}/magic_elf/addr_of_main
${OBJDUMP} --start-address=${main} -d ${TMP}/magic_elf/magic_elf \
| src/magic_elf/objdump_format.sh 

Command: src/magic_elf/objdump_format.sh
#!/bin/sh
# white space is tab-stop, not just spaces
tab='[^	]\{1,\}	'
space='[[:space:]]\{1,\}'
nospace='[^[:space:]]\{1,\}'

sed -n -e 's/ *	/	/g' \
	-e "s/${space}\([;!]\)/	\1/" \
	-e "s/^\(${tab}${tab}${nospace}\)${space}/\1	/" \
	-e '/^ *[[:xdigit:]]*:/,$ p' \
	-e '/ret/q' \
| expand -t 12,32,40,60

Output: out/i386-redhat-linux/magic_elf/objdump
 8048400:   55                  push    %ebp
 8048401:   89 e5               mov     %esp,%ebp
 8048403:   83 ec 0c            sub     $0xc,%esp
 8048406:   6a 03               push    $0x3
 8048408:   68 01 80 04 08      push    $0x8048001
 804840d:   6a 01               push    $0x1
 804840f:   e8 bc fe ff ff      call    80482d0 <_init+0x38>
 8048414:   31 c0               xor     %eax,%eax
 8048416:   89 ec               mov     %ebp,%esp
 8048418:   5d                  pop     %ebp
 8048419:   c3                  ret     

Both programs have main at the same file offset. Unfortunately, a brief look through /bin proves this to be pure chance. And instead of a real system call for write we see a call with a strange negative displacement (check the opcode). It resolves to a location in a shared library. But what function in what library?
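
Checking the opcode by hand: E8 is a near call whose operand is a 32-bit displacement relative to the address of the next instruction. The sketch below redoes that arithmetic. The function name call_target is made up for this example, and the code assumes a shell whose arithmetic is wider than 32 bits (true for dash and bash on current systems).

```shell
#!/bin/sh
# call_target ADDR DISP: print the destination of an i386 "call rel32".
#   ADDR  address of the call instruction itself
#   DISP  the four operand bytes read as a little-endian 32-bit value
call_target() {
	ct_next=$(( $1 + 5 ))		# opcode E8 plus rel32 is five bytes long
	ct_disp=$(( $2 ))
	# A value with bit 31 set is negative in two's complement.
	if [ $(( ${ct_disp} >= 0x80000000 )) -eq 1 ]; then
		ct_disp=$(( ${ct_disp} - 0x100000000 ))
	fi
	printf '0x%x\n' $(( ${ct_next} + ${ct_disp} ))
}

# Bytes "BC FE FF FF" at 0x804840f, read little-endian, give 0xFFFFFEBC.
call_target 0x804840f 0xFFFFFEBC	# prints: 0x80482d0
```

The displacement is -324, so the call lands 324 bytes before 0x8048414, the address that both disassemblers print.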

Command: src/magic_elf/gdb-core.sh
#!/bin/sh
gdb ${1} -q <<EOF
	set disassembly-flavor ${ASM_STYLE}
	disassemble ${2}
EOF

Command: src/magic_elf/gdb.sh
#!/bin/sh
file=${1:-${TMP}/magic_elf/magic_elf}
func=${2:-main}
src/magic_elf/gdb-core.sh ${file} ${func} \
| src/magic_elf/gdb-format.sh

Output: out/i386-redhat-linux/magic_elf/gdb
0x8048400 <main>:       push    ebp
0x8048401 <main+1>:     mov     ebp,esp
0x8048403 <main+3>:     sub     esp,0xc
0x8048406 <main+6>:     push    0x3
0x8048408 <main+8>:     push    0x8048001
0x804840d <main+13>:    push    0x1
0x804840f <main+15>:    call    0x80482d0 <write>
0x8048414 <main+20>:    xor     eax,eax
0x8048416 <main+22>:    mov     esp,ebp
0x8048418 <main+24>:    pop     ebp
0x8048419 <main+25>:    ret     

Not shown is a pathetic attempt to single-step to the actual code of write.

2.1. In doubt use force

We can now search for a fine manual explaining how to debug shared libraries. Or just compile the bugger statically.

Seems we found an easy way to fill up the hard disk. Anyway, what has gdb(1) to say about it?

The function was called write before, it is called write now. Let's look at what is behind the name.

There are two man pages giving some overview of system calls, intro(2) and syscalls(2). The statement mov eax,4 loads the system call number; the value corresponds to __NR_write in /usr/include/asm/unistd.h.
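
Looking up such a number is a one-liner with sed. The helper nr_of is a name invented for this sketch, and the header it reads is a two-line stand-in, because the location and layout of the real unistd.h differ between distributions and kernel versions.

```shell
#!/bin/sh
# nr_of NAME FILE: print the number that header FILE assigns to __NR_NAME.
nr_of() {
	sed -n -e "s/^#define[[:space:]]\{1,\}__NR_$1[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p" "$2"
}

# Stand-in header with two entries of the i386 system call table.
hdr=$(mktemp)
cat > "${hdr}" <<'EOF'
#define __NR_exit 1
#define __NR_write 4
EOF

echo "write=$(nr_of write "${hdr}") exit=$(nr_of exit "${hdr}")"
# prints: write=4 exit=1
rm -f "${hdr}"
```

On a real system the second argument would be the header named above, or whatever file it includes to do the actual work.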

2.2. In the language of evil

The code generated by gcc(1) is not suitable for a virus. So here comes hand-crafted code. [1]

Source: src/evil_magic/i386-Linux.asm
		global	_start
_start:		push	byte 4
		pop	eax		; eax = 4 = write(2)
		xor	ebx,ebx
		inc	ebx		; ebx = 1 = stdout
		mov	ecx,0x08048001	; ecx = magic address
		push	byte 3
		pop	edx		; edx = 3 = three characters
		int	0x80

		xor	eax,eax
		inc	eax		; eax = 1 = exit(2)
		xor	ebx,ebx		; ebx = 0 = return code
		int	0x80

Command: src/evil_magic/intel.sh
#!/bin/sh
nasm -f elf -o ${TMP}/evil_magic/${ASM_STYLE}.o \
	src/evil_magic/${ARCH}-${UNAME}.asm \
&& ld -o ${TMP}/evil_magic/${ASM_STYLE} ${TMP}/evil_magic/${ASM_STYLE}.o \
&& ${TMP}/evil_magic/${ASM_STYLE}

Output: out/i386-redhat-linux/evil_magic/intel
ELF

Output is good. But how do we get the resulting machine code? We can't just add a call to printf(3) to the assembly code. The example above is not linked with glibc; it does not even have a function called main.

2.3. Enter evil

On the other hand, things became a lot easier. There is no initialization code that gets executed before _start, so the address of _start is really the ELF entry point of the executable.

A look into /usr/include/elf.h shows that Elf32_Ehdr::e_entry is really at file offset 24: it follows e_ident (16 bytes), e_type (2), e_machine (2) and e_version (4).

The entry point is specified as a virtual address in memory. By subtracting the base address we get the file offset:

0x8048080 - 0x8048000 = 0x80 = 128
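
Both steps, reading e_entry at offset 24 and subtracting the base address, fit into a short script. This is only a sketch: entry_of is a made-up name, the input is a fake header built for the occasion rather than a real executable, and the code assumes a 32-bit little-endian ELF file loaded at 0x8048000.

```shell
#!/bin/sh
# entry_of FILE: print e_entry of a 32-bit little-endian ELF file.
entry_of() {
	# Bytes 24..27 hold e_entry; od prints them as four decimal numbers.
	set -- $(od -A n -t u1 -j 24 -N 4 "$1")
	printf '0x%x\n' $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Fake header for illustration: 24 filler bytes, then 0x08048080
# stored least significant byte first (octal 200 200 004 010).
fake=$(mktemp)
{ printf '%024d' 0; printf '\200\200\004\010'; } > "${fake}"

entry=$(entry_of "${fake}")
base=0x8048000			# assumed base address, as in the subtraction above
echo "entry=${entry} offset=$(( ${entry} - ${base} ))"
# prints: entry=0x8048080 offset=128
rm -f "${fake}"
```

The resulting offset and entry address are the kind of values that ndisasm takes through -e and -o in the first script of this chapter.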

2.4. Evil magic revealed

2.5. Dressing up binary code

There is still one thing left: dressing up the hex dump as C source. We use the script from "Dressing up binary code".

Output: out/i386-redhat-linux/evil_magic/evil_magic.c
const unsigned char main[]
__attribute__ (( aligned(8), section(".text") )) =
{
  0x6A,0x04,                     /* 08048080: push byte +0x4         */
  0x58,                          /* 08048082: pop eax                */
  0x31,0xDB,                     /* 08048083: xor ebx,ebx            */
  0x43,                          /* 08048085: inc ebx                */
  0xB9,0x01,0x80,0x04,0x08,      /* 08048086: mov ecx,0x8048001      */
  0x6A,0x03,                     /* 0804808B: push byte +0x3         */
  0x5A,                          /* 0804808D: pop edx                */
  0xCD,0x80,                     /* 0804808E: int 0x80               */
  0x31,0xC0,                     /* 08048090: xor eax,eax            */
  0x40,                          /* 08048092: inc eax                */
  0x31,0xDB,                     /* 08048093: xor ebx,ebx            */
  0xCD,0x80                      /* 08048095: int 0x80               */
};

Calling the array main is not a mistake. The output above is a complete and valid C program.
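
The listing can also be checked against the byte count claimed in the footnote. The sketch below strips the comments, which contain hex numbers of their own, and counts what is left. The name count_bytes is an invention of this example, and the initializers are repeated in a here-document so that the script runs on its own.

```shell
#!/bin/sh
# count_bytes: count the 0xNN initializers of a dressed-up hex dump
# read from standard input.
count_bytes() {
	sed -e 's|/\*.*\*/||' \
	| tr -s ', \t' '\n\n\n' \
	| grep -c '^0x[0-9A-Fa-f][0-9A-Fa-f]$'
}

count_bytes <<'EOF'
  0x6A,0x04,                     /* 08048080: push byte +0x4         */
  0x58,                          /* 08048082: pop eax                */
  0x31,0xDB,                     /* 08048083: xor ebx,ebx            */
  0x43,                          /* 08048085: inc ebx                */
  0xB9,0x01,0x80,0x04,0x08,      /* 08048086: mov ecx,0x8048001      */
  0x6A,0x03,                     /* 0804808B: push byte +0x3         */
  0x5A,                          /* 0804808D: pop edx                */
  0xCD,0x80,                     /* 0804808E: int 0x80               */
  0x31,0xC0,                     /* 08048090: xor eax,eax            */
  0x40,                          /* 08048092: inc eax                */
  0x31,0xDB,                     /* 08048093: xor ebx,ebx            */
  0xCD,0x80                      /* 08048095: int 0x80               */
EOF
# prints: 23
```

The same number falls out of the addresses: the last instruction ends at 0x08048097, which is 0x17 bytes past 0x08048080.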

Command: src/evil_magic/cc.sh
#!/bin/sh
gcc -Wall -O2 ${OUT}/evil_magic/evil_magic.c \
	-o ${TMP}/evil_magic/cc \
&& ${TMP}/evil_magic/cc

Output: out/i386-redhat-linux/evil_magic/cc
out/i386-redhat-linux/evil_magic/evil_magic.c:2: warning: `main' is usually a function
ELF

Notes

[1] Optimized for size. Twenty-three is the perfect number of bytes. See http://www.goethe.de/uk/mon/archiv/gh00/e23.htm