Outside of a dog, a book is a man's best friend. Inside a dog it's too dark to read.
Groucho Marx
ELF provides two parallel views of a file's contents. The linking view is defined by the section header table, an array of Elf32_Shdr. The execution view is defined by the program header table, an array of Elf32_Phdr.
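To make the two tables a bit more tangible, here is a rough sketch of the 32-bit entry layouts as given in the ELF specification. The names sketch_phdr, sketch_shdr and entry_offset are made up for this illustration; real code would include <elf.h> and use Elf32_Phdr and Elf32_Shdr directly.

```c
#include <stdint.h>

/* Illustration only: hypothetical stand-in for Elf32_Phdr (execution view). */
typedef struct {
    uint32_t p_type;    /* kind of segment, e.g. PT_LOAD */
    uint32_t p_offset;  /* position in the file */
    uint32_t p_vaddr;   /* virtual address in memory */
    uint32_t p_paddr;   /* physical address, mostly ignored */
    uint32_t p_filesz;  /* bytes taken from the file */
    uint32_t p_memsz;   /* bytes occupied in memory */
    uint32_t p_flags;   /* read/write/execute permissions */
    uint32_t p_align;   /* alignment in file and memory */
} sketch_phdr;

/* Illustration only: hypothetical stand-in for Elf32_Shdr (linking view). */
typedef struct {
    uint32_t sh_name;       /* index into the section name string table */
    uint32_t sh_type;       /* e.g. SHT_PROGBITS, SHT_NOBITS */
    uint32_t sh_flags;      /* e.g. SHF_ALLOC, SHF_WRITE */
    uint32_t sh_addr;       /* virtual address, 0 if not loaded */
    uint32_t sh_offset;     /* position in the file */
    uint32_t sh_size;       /* size in bytes */
    uint32_t sh_link;       /* index of a related section */
    uint32_t sh_info;       /* extra information, depends on type */
    uint32_t sh_addralign;  /* required alignment */
    uint32_t sh_entsize;    /* size of one entry if the section is a table */
} sketch_shdr;

/* File offset of entry number `index` in a table that starts at `table_off`. */
static uint32_t entry_offset(uint32_t table_off, uint32_t entsize, uint32_t index)
{
    return table_off + entsize * index;
}
```

The sizes of these two structures, 32 and 40 bytes, are the values that elfdump reports further down as e_phentsize and e_shentsize. With e_phoff = 0x34, entry_offset(0x34, 32, 1) yields 0x54 for the second program header.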
Theoretically the ELF specification is quite liberal. Position, contents and order of sections and segments are not restricted. But in real life an operating system is used to just one program loader, one linker and a few compilers. This makes the work of virus writers easier. We can reverse engineer the de-facto standard, a tiny subset of what the ELF standard allows. On a typical system only a minority of programs violate this subset, so ignoring them does not lower chances of survival.
The entries of both section header table and program header table are ordered, consecutive, non-overlapping and cover every byte of the file. The standard describes section headers as optional for programs, but you can't build dynamically linked executables without them. Still worse, strip(1) performs a destructive operation on the section headers that will break infected executables if we don't maintain the section headers as well.
Let's get a bit more serious and examine the assembly program from The language of evil. A standalone executable built from assembler source is probably the most trivial example we can find.
Command: pre/sparc-sunos5.7/readelf/segments/objdump.sh
#!/usr/xpg4/bin/sh
/usr/xpg4/bin/ls -Ll tmp/sparc-sunos5.7/evil_magic/att
/usr/local/bin/objdump -fp tmp/sparc-sunos5.7/evil_magic/att
Output: out/sparc-sunos5.7/readelf/segments/objdump
-rwxr-xr-x 1 alba alba 444 Oct 23 01:57 tmp/sparc-sunos5.7/evil_magic/att
tmp/sparc-sunos5.7/evil_magic/att: file format elf32-sparc
architecture: sparc, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x0000000000010074
Program Header:
LOAD off 0x0000000000000000 vaddr 0x0000000000010000 paddr 0x0000000000010000 align 2**16
filesz 0x0000000000000098 memsz 0x0000000000000098 flags r-x
LOAD off 0x0000000000000098 vaddr 0x0000000000020098 paddr 0x0000000000020098 align 2**16
filesz 0x0000000000000000 memsz 0x0000000000000000 flags rw-
objdump's output is butt-ugly. On to readelf.
Command: pre/sparc-sunos5.7/readelf/segments/readelf.sh
#!/usr/xpg4/bin/sh
/usr/local/bin/readelf -l tmp/sparc-sunos5.7/evil_magic/att
Output: out/sparc-sunos5.7/readelf/segments/readelf
Elf file type is EXEC (Executable file)
Entry point 0x10074
There are 2 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
LOAD 0x000000 0x00010000 0x00010000 0x00098 0x00098 R E 0x10000
LOAD 0x000098 0x00020098 0x00020098 0x00000 0x00000 RW 0x10000
Section to Segment mapping:
Segment Sections...
00 .text
01
And just to complete the confusion, a look at the native Solaris tool.
Command: pre/sparc-sunos5.7/readelf/segments/elfdump.sh
#!/usr/xpg4/bin/sh
/usr/ccs/bin/elfdump -ep tmp/sparc-sunos5.7/evil_magic/att
Output: out/sparc-sunos5.7/readelf/segments/elfdump
ELF Header
ei_magic: { 0x7f, E, L, F }
ei_class: ELFCLASS32 ei_data: ELFDATA2MSB
e_machine: EM_SPARC e_version: EV_CURRENT
e_type: ET_EXEC
e_flags: 0
e_entry: 0x10074 e_ehsize: 52 e_shstrndx: 5
e_shoff: 0xcc e_shentsize: 40 e_shnum: 6
e_phoff: 0x34 e_phentsize: 32 e_phnum: 2
Program Header[0]:
p_vaddr: 0x10000 p_flags: [ PF_X PF_R ]
p_paddr: 0x10000 p_type: [ PT_LOAD ]
p_filesz: 0x98 p_memsz: 0x98
p_offset: 0 p_align: 0x10000
Program Header[1]:
p_vaddr: 0x20098 p_flags: [ PF_W PF_R ]
p_paddr: 0x20098 p_type: [ PT_LOAD ]
p_filesz: 0 p_memsz: 0
p_offset: 0x98 p_align: 0x10000
Nice to see the entry point (0x10074) again. Program layout is a simplified variation of Sort of an answer. The value of FileSiz includes the ELF header and the program header table. The size of this overhead is:
overhead = Entry point - VirtAddr = 0x10074 - 0x10000 = 0x74 = 116 bytes
So effective code size is:
code size = FileSiz - overhead = 0x98 - 0x74 = 0x24 = 36 bytes
This matches the disassembly listing. However, the ratio of file size to effective code deserves the title "Bloat", with capital B.
code size / file size = 36 / 444 = 0.081
Only 8 percent of the file actually does something useful!
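The arithmetic above can be written down as a few helper functions. This is only a sketch with made-up names; it assumes, as in this trivial program, that the code starts right after the headers, so the overhead can be cross-checked against e_ehsize, e_phnum and e_phentsize from the elfdump output.

```c
#include <stdint.h>

/* Bytes between the start of the first LOAD segment and the entry point. */
static uint32_t header_overhead(uint32_t entry, uint32_t vaddr)
{
    return entry - vaddr;
}

/* Same overhead, computed from the ELF header fields instead.
   Only valid if nothing but the headers precedes the code. */
static uint32_t header_overhead_from_ehdr(uint32_t ehsize, uint32_t phnum,
                                          uint32_t phentsize)
{
    return ehsize + phnum * phentsize;
}

/* Bytes of the first segment that are not header overhead. */
static uint32_t effective_code_size(uint32_t filesz, uint32_t entry,
                                    uint32_t vaddr)
{
    return filesz - header_overhead(entry, vaddr);
}

/* Share of useful code in the file, in percent, rounded down. */
static unsigned useful_percent(uint32_t code_size, uint32_t file_size)
{
    return (unsigned)(code_size * 100u / file_size);
}
```

Both ways of computing the overhead agree for this file: 0x10074 - 0x10000 is 116, and so is 52 + 2 * 32.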
readelf(1) features another option, -S. objdump(1) calls that -h.
Command: pre/sparc-sunos5.7/readelf/sections/objdump.sh
#!/usr/xpg4/bin/sh
/usr/local/bin/objdump -h tmp/sparc-sunos5.7/evil_magic/att
Output: out/sparc-sunos5.7/readelf/sections/objdump
tmp/sparc-sunos5.7/evil_magic/att: file format elf32-sparc
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000024 0000000000010074 0000000000010074 00000074 2**0
CONTENTS, ALLOC, LOAD, READONLY, CODE
1 .data 00000000 0000000000020098 0000000000020098 00000098 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .sbss 00000000 0000000000020098 0000000000020098 00000098 2**0
CONTENTS
3 .bss 00000000 0000000000020098 0000000000020098 00000098 2**0
ALLOC
objdump's output for sections is outright disgusting. A real problem is the broken numbering due to ignored entries in the section table. The item at index 0 is actually of type SHT_NULL. Its index (SHN_UNDEF = 0) serves to mark an unused value of sh_link. Less troublesome is the ignored string table, a section of type SHT_STRTAB.
Command: pre/sparc-sunos5.7/readelf/sections/readelf.sh
#!/usr/xpg4/bin/sh
/usr/local/bin/readelf -S tmp/sparc-sunos5.7/evil_magic/att
Output: out/sparc-sunos5.7/readelf/sections/readelf
There are 6 section headers, starting at offset 0xcc:
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .text PROGBITS 00010074 000074 000024 00 AX 0 0 1
[ 2] .data PROGBITS 00020098 000098 000000 00 WA 0 0 1
[ 3] .sbss PROGBITS 00020098 000098 000000 00 W 0 0 1
[ 4] .bss NOBITS 00020098 000098 000000 00 WA 0 0 1
[ 5] .shstrtab STRTAB 00000000 000098 000032 00 0 0 1
Key to Flags:
W (write), A (alloc), X (execute), M (merge), S (strings)
I (info), L (link order), G (group), x (unknown)
O (extra OS processing required) o (OS specific), p (processor specific)
elfdump also omits the SHT_NULL section but at least gets the numbering right.
Command: pre/sparc-sunos5.7/readelf/sections/elfdump.sh
#!/usr/xpg4/bin/sh
/usr/ccs/bin/elfdump -c tmp/sparc-sunos5.7/evil_magic/att
Output: out/sparc-sunos5.7/readelf/sections/elfdump
Section Header[1]: sh_name: .text
sh_addr: 0x10074 sh_flags: [ SHF_ALLOC SHF_EXECINSTR ]
sh_size: 0x24 sh_type: [ SHT_PROGBITS ]
sh_offset: 0x74 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
Section Header[2]: sh_name: .data
sh_addr: 0x20098 sh_flags: [ SHF_WRITE SHF_ALLOC ]
sh_size: 0 sh_type: [ SHT_PROGBITS ]
sh_offset: 0x98 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
Section Header[3]: sh_name: .sbss
sh_addr: 0x20098 sh_flags: [ SHF_WRITE ]
sh_size: 0 sh_type: [ SHT_PROGBITS ]
sh_offset: 0x98 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
Section Header[4]: sh_name: .bss
sh_addr: 0x20098 sh_flags: [ SHF_WRITE SHF_ALLOC ]
sh_size: 0 sh_type: [ SHT_NOBITS ]
sh_offset: 0x98 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
Section Header[5]: sh_name: .shstrtab
sh_addr: 0 sh_flags: 0
sh_size: 0x32 sh_type: [ SHT_STRTAB ]
sh_offset: 0x98 sh_entsize: 0
sh_link: 0 sh_info: 0
sh_addralign: 0x1
The most interesting entry is .text. The start of this section, 0x10074, equals the entry point. This is neither a coincidence nor the degenerate case of a trivial program. Further down the same is demonstrated on /usr/xpg4/bin/sh. And Scan entry point shows it to be generally true, though it is nowhere specified in the standard. A search through standard places like /bin:
Output: out/sparc-sunos5.7/scanner/entry_point_big
files=0731; detected=0000
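The comparison itself is simple. The following is not the scanner from Scan entry point, just a minimal sketch of the check, assuming the section headers have already been read and their names resolved through .shstrtab. The type sketch_section and the function entry_is_text_start are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, already decoded view of one section header. */
typedef struct {
    const char *name;  /* section name, resolved through .shstrtab */
    uint32_t    addr;  /* sh_addr */
} sketch_section;

/* Returns 1 if the section called .text starts at the entry point,
   0 if it starts somewhere else, and -1 if there is no .text at all. */
static int entry_is_text_start(uint32_t entry,
                               const sketch_section *sections, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (strcmp(sections[i].name, ".text") == 0)
            return sections[i].addr == entry ? 1 : 0;
    }
    return -1;
}
```

Fed with the values from the readelf output above (entry point 0x10074, .text at 0x10074), the function returns 1.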
Anyway, we see that even for trivial examples the code is surrounded by lots of other stuff. Let's zoom in on our target.
Command: pre/sparc-sunos5.7/readelf/sh/segments/readelf.sh
#!/usr/xpg4/bin/sh
/usr/local/bin/readelf -l /usr/xpg4/bin/sh
Output: out/sparc-sunos5.7/readelf/sh/segments/readelf
Elf file type is EXEC (Executable file)
Entry point 0x17560
There are 5 program headers, starting at offset 52
Program Headers:
Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
PHDR 0x000034 0x00010034 0x00000000 0x000a0 0x000a0 R E 0
INTERP 0x0000d4 0x00000000 0x00000000 0x00011 0x00000 R 0
[Requesting program interpreter: /usr/lib/ld.so.1]
LOAD 0x000000 0x00010000 0x00000000 0x2e102 0x2e102 R E 0x10000
LOAD 0x02e104 0x0004e104 0x00000000 0x00b18 0x0201f RWE 0x10000
DYNAMIC 0x02e6e8 0x0004e6e8 0x00000000 0x000b8 0x00000 RWE 0
Section to Segment mapping:
Segment Sections...
00
01
02 .interp .hash .dynsym .dynstr .SUNW_version .rela.ex_shared .rela.data .rela.bss .rela.plt .text .init .fini .exception_ranges .rodata .rodata1
03 .got .plt .dynamic .ex_shared .data .data1 .bss
04
Looks intimidating. But then the ELF specification says that only segments of type "LOAD" are considered for execution. Since the flags of the first one include "execute" but not "write", it must be the code segment. The other one has the "write" flag set, so it must be the data segment. There is one possible deviation: on sparc-sunos most executables built by Sun feature a data segment with the "execute" flag set as well.
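This rule of thumb fits into a few lines of C. The sketch below uses its own constants (prefixed SK_) so that it compiles without <elf.h>; the values are those of PT_LOAD, PF_X, PF_W and PF_R from the ELF specification. The names seg_kind and classify_segment are invented for this example.

```c
#include <stdint.h>

/* Values as defined by the ELF specification, renamed to avoid clashes. */
enum { SK_PT_LOAD = 1 };
enum { SK_PF_X = 1, SK_PF_W = 2, SK_PF_R = 4 };

typedef enum { SEG_OTHER, SEG_CODE, SEG_DATA } seg_kind;

/* Classify a program header by type and permission flags. */
static seg_kind classify_segment(uint32_t p_type, uint32_t p_flags)
{
    if (p_type != SK_PT_LOAD)
        return SEG_OTHER;   /* PHDR, INTERP, DYNAMIC and friends */
    if (p_flags & SK_PF_W)
        return SEG_DATA;    /* writable; may also be executable on Sun builds */
    if (p_flags & SK_PF_X)
        return SEG_CODE;    /* executable but not writable */
    return SEG_OTHER;
}
```

The "write" flag is tested first on purpose. That way a data segment with flags RWE, as seen in /usr/xpg4/bin/sh, is not mistaken for the code segment.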
Previous examples in The language of evil used an __attribute__ clause to put the code into section .text. Without that it would end up in section .rodata. Both are members of the code segment, which is executable in its entirety; in this regard that would make no difference.
But what about putting the code into a write-enabled data segment? These settings can probably be changed by mprotect(2). But what are the default settings?
Output = Source: out/sparc-sunos5.7/evil_magic/func.inc
const unsigned char in_code[]
__attribute__ (( aligned(8), section(".text") )) =
{
0x82,0x10,0x20,0x04, /* 0: mov 4, %g1 */
0x90,0x10,0x20,0x01, /* 4: mov 1, %o0 */
0x13,0x00,0x00,0x40, /* 8: sethi %hi(0x10000), %o1 */
0x92,0x12,0x60,0x01, /* c: or %o1, 1, %o1 */
0x94,0x10,0x20,0x03, /* 10: mov 3, %o2 */
0x91,0xd0,0x20,0x08, /* 14: ta 8 */
0x81,0xc3,0xe0,0x08, /* 18: retl */
0x01,0x00,0x00,0x00 /* 1c: nop */
}; /* 32 bytes (0x20) */
Source: pre/sparc-sunos5.7/evil_magic/self_modify.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "func.inc"
typedef void (*PfnVoid)(void);
#define TEST(where) \
printf("\n%08p is " #where " ... ", in_##where); \
(*(PfnVoid)in_##where)();
#define MEMCPY_TEST(where) \
memcpy(in_##where, in_code, sizeof(in_code)); \
TEST(where)
static char in_data[sizeof(in_code)];
int main()
{
char* in_heap = malloc(sizeof(in_code));
char in_stack[sizeof(in_code)];
int rc;
setvbuf(stdout, 0, _IONBF, 0);
TEST(code);
MEMCPY_TEST(data);
MEMCPY_TEST(heap);
MEMCPY_TEST(stack);
return 0;
}
Output: out/sparc-sunos5.7/evil_magic/self_modify
000106e0 is code ... ELF
000217dc is data ... ELF
00021810 is heap ... ELF
effffa48 is stack ... ELF
Back to the program headers of /usr/xpg4/bin/sh. MemSiz (0x201f) is larger than FileSiz (0xb18) in the data segment. Just like with mmap(2), excess bytes are defined to be initialized with 0. The linker takes advantage of that by grouping all variables that should be initialized to zero at the end. Note that the last section of segment 3 (counting starts with 0) is called .bss, the traditional name for this kind of area.
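What the loader does with the difference between the two sizes can be imitated in user space. This is a sketch under the assumption that the segment is simply copied into a buffer; a real loader maps pages instead. Both function names are made up.

```c
#include <stdint.h>
#include <string.h>

/* Number of bytes that exist in memory but not in the file. */
static uint32_t zero_fill_size(uint32_t filesz, uint32_t memsz)
{
    return memsz > filesz ? memsz - filesz : 0;
}

/* Build the memory image of a segment: `filesz` bytes come from the
   file, the rest up to `memsz` is cleared. `dst` must hold at least
   `memsz` bytes. */
static void load_segment_image(unsigned char *dst,
                               const unsigned char *file_bytes,
                               uint32_t filesz, uint32_t memsz)
{
    memcpy(dst, file_bytes, filesz);
    memset(dst + filesz, 0, zero_fill_size(filesz, memsz));
}
```

For the data segment shown above, zero_fill_size(0xb18, 0x201f) gives 0x1507, that is 5383 bytes that take up no space in the file.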
The mapping for segment 2 looks even more complex. But I would guess that .rodata means "read-only data" and .text contains productive code, as opposed to the administrative stuff in the other sections. LSB [1] has a good overview of section names. [2] Anyway, a detailed look at section .text shows that its start address (0x17560) equals the entry point.
Command: pre/sparc-sunos5.7/readelf/sh/sections/readelf.sh
#!/usr/xpg4/bin/sh
/usr/local/bin/readelf -S /usr/xpg4/bin/sh \
| /usr/xpg4/bin/grep '\.text'
Output: out/sparc-sunos5.7/readelf/sh/sections/readelf
[10] .text PROGBITS 00017560 007560 024cac 00 AX 0 0 4
Some executables of Red Hat 8.0 have an additional program header of type GNU_EH_FRAME.
[1]
[2] http://www.linuxbase.org/spec/gLSB/gLSB/specialsections.html