5. readelf & objdump

Outside of a dog, a book is a man's best friend.

Inside a dog it's too dark to read.

 Groucho Marx

ELF provides two parallel views of a file's contents. The linking view is defined by the section header table, an array of Elf32_Shdr. The execution view is defined by the program header table, an array of Elf32_Phdr.
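To make the two views concrete, here is a minimal sketch that pulls the location and entry count of both tables out of an in-memory image. The struct elf32_views and the function elf32_get_views are hypothetical names invented for this illustration; the types and constants come from <elf.h>. The sketch assumes the byte order of the file matches the host.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: where the two header tables live in the file. */
struct elf32_views {
    Elf32_Off  sh_off;   /* linking view:   offset of Elf32_Shdr[] */
    Elf32_Half sh_num;   /*                 number of entries      */
    Elf32_Off  ph_off;   /* execution view: offset of Elf32_Phdr[] */
    Elf32_Half ph_num;   /*                 number of entries      */
};

/* Fill *out from an ELF32 image. Returns 0 on success, -1 if the buffer is
 * too short or does not look like a 32-bit ELF file. */
static int elf32_get_views(const void *image, size_t len,
                           struct elf32_views *out)
{
    Elf32_Ehdr eh;

    assert(image != NULL && out != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, image, sizeof eh);      /* copy to avoid unaligned access */
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS32)
        return -1;

    out->sh_off = eh.e_shoff;
    out->sh_num = eh.e_shnum;
    out->ph_off = eh.e_phoff;
    out->ph_num = eh.e_phnum;
    return 0;
}
```

Both tables are found through the ELF header alone, which is why the two views can coexist without knowing about each other.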

Theoretically the ELF specification is quite liberal. Position, contents and order of sections and segments are not restricted. But in real life an operating system is used to just one program loader, one linker and a few compilers. This makes the work of virus writers easier. We can reverse engineer the de-facto standard, a tiny subset of what the ELF standard allows. On a typical system only a minority of programs violate this subset, so ignoring them does not lower the chances of survival.

The entries of both the section header table and the program header table are ordered, consecutive, non-overlapping and cover every byte of the file. The standard describes section headers as optional for programs, but you can't build dynamically linked executables without them. Worse still, strip(1) performs a destructive operation on the section headers that will break infected executables if we don't maintain the section headers as well.
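The "ordered and non-overlapping" part of that de-facto standard can be tested mechanically. The following is only a sketch: sections_ordered is a made-up name, it looks at file offsets only, and it does not check that every byte of the file is covered. Sections of type SHT_NOBITS are treated as occupying no space in the file.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Returns 1 if the section headers appear in ascending file order without
 * overlapping each other, 0 otherwise. */
static int sections_ordered(const Elf32_Shdr *sh, size_t n)
{
    Elf32_Off end = 0;   /* first file offset past the previous section */
    size_t i;

    assert(sh != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (sh[i].sh_type == SHT_NULL)
            continue;                    /* unused entry, e.g. index 0 */
        if (sh[i].sh_offset < end)
            return 0;                    /* out of order or overlapping */
        if (sh[i].sh_type != SHT_NOBITS)
            end = sh[i].sh_offset + sh[i].sh_size;
    }
    return 1;
}
```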

5.1. Segments

Let's get a bit more serious and examine the assembly program from The language of evil. A standalone executable built from assembler source is probably the most trivial example we can find.

5.1.3. Observations

Nice to see the entry point (0x8048080) again. Program layout is a simplified variation of Sort of an answer. The value of FileSiz includes the ELF header and the program header table. The size of this overhead is:

overhead = Entry point - VirtAddr = 0x8048080 - 0x8048000 = 0x80 = 128 bytes

So effective code size is:

code size = FileSiz - overhead = 0x97 - 0x80 = 0x17 = 23 bytes

This matches the disassembly listing. However, the ratio of file size to effective code deserves the title "Bloat", with capital B.

code size / file size = 23 / 416 = 0.055

Only 6 percent of the file actually does something useful!
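The arithmetic above is simple enough to redo for any other executable. Here is a small sketch; the function names are invented for this example and the inputs are the values readelf prints for the first LOAD segment.

```c
#include <assert.h>

/* Bytes of ELF header and program headers that sit in front of the code:
 * entry point minus VirtAddr of the segment. */
static unsigned long header_overhead(unsigned long entry, unsigned long vaddr)
{
    assert(entry >= vaddr);
    return entry - vaddr;
}

/* Effective code size: FileSiz minus that overhead. */
static unsigned long code_size(unsigned long filesz, unsigned long overhead)
{
    assert(filesz >= overhead);
    return filesz - overhead;
}

/* Share of useful bytes in parts per thousand (integer division). */
static unsigned long per_mille(unsigned long part, unsigned long whole)
{
    assert(whole != 0);
    return part * 1000UL / whole;
}
```

With the numbers from this section, header_overhead(0x8048080, 0x8048000) gives 128, code_size(0x97, 128) gives 23, and per_mille(23, 416) gives 55, which is the 0.055 from above.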

5.2. Sections

readelf(1) features another option, -S. objdump(1) calls that -h.

5.2.3. Observations

The most interesting entry is .text. The start of this section, 0x8048080, equals the entry point. This is neither a coincidence nor the degenerate case of a trivial program. Further down this is demonstrated on /bin/sh, and Scan entry point shows it to be generally true, though it is nowhere specified in the standard. A search through standard places like /bin:

Output: out/i386-redhat7.3-linux/scanner/entry_point_big
files=1712; detected=0000
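The check behind such a scan could look roughly like the following. This is not the scanner used to produce the output above, just a sketch under assumptions: text_starts_at_entry is a hypothetical name, the whole file is already in memory, and its byte order matches the host.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if section ".text" starts at e_entry, 0 if it does not, and -1
 * if the image is malformed or has no ".text" section. */
static int text_starts_at_entry(const unsigned char *img, size_t len)
{
    Elf32_Ehdr eh;
    Elf32_Shdr strtab, sh;
    size_t i;

    assert(img != NULL);
    if (len < sizeof eh)
        return -1;
    memcpy(&eh, img, sizeof eh);
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1;
    if (eh.e_shentsize != sizeof sh || eh.e_shstrndx >= eh.e_shnum)
        return -1;
    if (eh.e_shoff > len || (size_t)eh.e_shnum * sizeof sh > len - eh.e_shoff)
        return -1;

    /* The section name string table is itself a section. */
    memcpy(&strtab, img + eh.e_shoff + eh.e_shstrndx * sizeof sh, sizeof sh);
    if (strtab.sh_offset > len || strtab.sh_size > len - strtab.sh_offset)
        return -1;

    for (i = 0; i < eh.e_shnum; i++) {
        const char *name;

        memcpy(&sh, img + eh.e_shoff + i * sizeof sh, sizeof sh);
        if (sh.sh_name >= strtab.sh_size)
            continue;
        name = (const char *)img + strtab.sh_offset + sh.sh_name;
        /* Compare 6 bytes: ".text" plus the terminating NUL. */
        if (strtab.sh_size - sh.sh_name >= 6 && memcmp(name, ".text", 6) == 0)
            return sh.sh_addr == eh.e_entry;
    }
    return -1;
}
```

A scanner would map each file, call this function, and count the files for which it returns 0.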

5.3. Bashful glance

Anyway, we see that even for trivial examples the code is surrounded by lots of other stuff. Let's zoom in on our target.

Looks intimidating. But then the ELF specification says that only segments of type "LOAD" are considered for execution. Since the flags of the first one include "execute" but not "write", it must be the code segment. The other one has the "write" flag set, so it must be the data segment. There is one possible deviation: on sparc-sunos most executables built by Sun feature a data segment with the "execute" flag.
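The rule of thumb from this paragraph fits into a few lines. The enum and the function name are made up for the illustration; PT_LOAD, PF_W and PF_X are the constants from <elf.h>. Testing "write" first means that a writable and executable segment, as described for sparc-sunos, still counts as data.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

enum seg_kind { SEG_OTHER, SEG_CODE, SEG_DATA };

/* Classify one program header by type and permission flags. */
static enum seg_kind classify_segment(const Elf32_Phdr *ph)
{
    assert(ph != NULL);
    if (ph->p_type != PT_LOAD)
        return SEG_OTHER;      /* not loaded, irrelevant for execution */
    if (ph->p_flags & PF_W)
        return SEG_DATA;       /* writable: data, even if also executable */
    if (ph->p_flags & PF_X)
        return SEG_CODE;       /* executable but not writable: code */
    return SEG_OTHER;
}
```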

5.4. Self modifying code

Previous examples in The language of evil used an __attribute__ clause to put the code into section .text. Without that it would end up in section .rodata. Both are members of the code segment, which is executable in its entirety; in this regard that would make no difference.

But what about putting the code into a write-enabled data segment? These settings can probably be changed by mprotect(2). But what are the default settings?

Output = Source: out/i386-redhat7.3-linux/evil_magic/func.inc
const unsigned char in_code[]
__attribute__ (( aligned(8), section(".text") )) =
{
  0x53,                          /* 00000000: push ebx               */
  0x6A,0x04,                     /* 00000001: push byte +0x4         */
  0x58,                          /* 00000003: pop eax                */
  0x31,0xDB,                     /* 00000004: xor ebx,ebx            */
  0x43,                          /* 00000006: inc ebx                */
  0xB9,0x01,0x80,0x04,0x08,      /* 00000007: mov ecx,0x8048001      */
  0x6A,0x03,                     /* 0000000C: push byte +0x3         */
  0x5A,                          /* 0000000E: pop edx                */
  0xCD,0x80,                     /* 0000000F: int 0x80               */
  0x5B,                          /* 00000011: pop ebx                */
  0xC3                           /* 00000012: ret                    */
}; /* 19 bytes (0x13) */

Source: pre/i386-redhat7.3-linux/evil_magic/self_modify.c
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#include "func.inc"

typedef void (*PfnVoid)(void);

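/* Print the address of in_<where>, then call the bytes stored there. */
#define TEST(where) \
	do { \
		printf("\n%p is " #where " ... ", (void *)in_##where); \
		(*(PfnVoid)in_##where)(); \
	} while (0)
/* Copy the code to in_<where> first, then run it from there. */
#define MEMCPY_TEST(where) \
	do { \
		memcpy(in_##where, in_code, sizeof(in_code)); \
		TEST(where); \
	} while (0)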

static char in_data[sizeof(in_code)];

int main()
{
  char* in_heap = malloc(sizeof(in_code));
  char in_stack[sizeof(in_code)];

  setvbuf(stdout, 0, _IONBF, 0);
  TEST(code);
  MEMCPY_TEST(data);
  MEMCPY_TEST(heap);
  MEMCPY_TEST(stack);

  return 0;
}

Output: out/i386-redhat7.3-linux/evil_magic/self_modify

0x8048490 is code ... ELF
0x8049748 is data ... ELF
0x8049788 is heap ... ELF
0xbffff8e0 is stack ... ELF
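On this system all four regions are executable by default, so mprotect(2) is not needed. For a system where a region is not, the call mentioned above could be wrapped as in the following sketch. The names page_floor and protect_range are invented here. The only real subtlety is that mprotect(2) wants a page-aligned start address, so the address is rounded down first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round addr down to the start of its page. pagesize must be a power of 2. */
static uintptr_t page_floor(uintptr_t addr, uintptr_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return addr & ~(pagesize - 1);
}

/* Apply protection bits (PROT_READ, PROT_WRITE, PROT_EXEC) to every page
 * touched by [addr, addr + len). Returns the result of mprotect(2), or -1
 * if the page size cannot be determined. */
static int protect_range(void *addr, size_t len, int prot)
{
    long ps = sysconf(_SC_PAGESIZE);
    uintptr_t start, end;

    if (ps <= 0)
        return -1;
    start = page_floor((uintptr_t)addr, (uintptr_t)ps);
    end = (uintptr_t)addr + len;
    return mprotect((void *)start, end - start, prot);
}
```

In the test program a call like protect_range(in_data, sizeof(in_data), PROT_READ | PROT_WRITE | PROT_EXEC) would go in front of MEMCPY_TEST(data). Whether the kernel grants the request is a matter of system policy.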

5.5. Final observations

MemSiz (0x9ad0) is larger than FileSiz (0x5934) in the data segment. Just like with mmap(2), the excess bytes are defined to be initialized with 0. The linker takes advantage of that by grouping all variables that should be initialized to zero at the end. Note that the last section of segment 3 (counting starts with 0) is called .bss, the traditional name for this kind of area.
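What the loader does with the difference can be sketched in two functions. Both names are hypothetical, and a real loader maps pages instead of copying bytes; the sketch only shows which bytes come from the file and which are zero.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes at the end of a segment that are not backed by the file. */
static Elf32_Word zero_fill_bytes(const Elf32_Phdr *ph)
{
    assert(ph != NULL && ph->p_memsz >= ph->p_filesz);
    return ph->p_memsz - ph->p_filesz;
}

/* Build the memory image of one segment in dst, which must have room for
 * p_memsz bytes: p_filesz bytes from the file, the rest set to 0. */
static void load_segment_image(unsigned char *dst, const unsigned char *file,
                               const Elf32_Phdr *ph)
{
    assert(dst != NULL && file != NULL);
    memcpy(dst, file + ph->p_offset, ph->p_filesz);
    memset(dst + ph->p_filesz, 0, zero_fill_bytes(ph));
}
```

For the data segment above, zero_fill_bytes yields 0x9ad0 - 0x5934 = 0x419c, that is 16796 bytes that take up no space in the file.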

The mapping for segment 2 looks even more complex. But I would guess that .rodata means "read-only data" and .text contains productive code, as opposed to the administrative stuff in the other sections. LSB [1] has a good overview of section names. [2] Anyway, a closer look at section .text shows that its start address (0x8059440) equals the entry point.

Command: pre/i386-redhat7.3-linux/readelf/sh/sections/readelf.sh
#!/bin/sh
/usr/bin/readelf -S /bin/sh \
| /bin/grep '\.text'

Output: out/i386-redhat7.3-linux/readelf/sh/sections/readelf
  [12] .text             PROGBITS        08059440 011440 058680 00  AX  0   0 16
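The three hexadecimal columns after the type are address, file offset and size. A line in this format can be picked apart with sscanf(3). The struct and the function below are invented for this example and only cope with rows that have a non-empty name, like the one above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record for one row of "readelf -S" output. */
struct section_row {
    char name[32];
    unsigned long addr;   /* virtual address of the section */
    unsigned long off;    /* offset in the file */
    unsigned long size;   /* size in bytes */
};

/* Parse a row like "  [12] .text  PROGBITS  08059440 011440 058680 ...".
 * Returns 0 on success, -1 if the line does not match. */
static int parse_section_row(const char *line, struct section_row *row)
{
    unsigned idx;
    char type[32];

    assert(line != NULL && row != NULL);
    if (sscanf(line, " [%u] %31s %31s %lx %lx %lx",
               &idx, row->name, type,
               &row->addr, &row->off, &row->size) != 6)
        return -1;
    return 0;
}
```

For the line above this gives addr = 0x8059440 and off = 0x11440. The difference is 0x8048000, the same VirtAddr the assembly example was loaded at.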

Some executables of Red Hat 8.0 have an additional program header of type GNU_EH_FRAME.

Notes

[1] http://www.linuxbase.org/

[2] http://www.linuxbase.org/spec/gLSB/gLSB/specialsections.html