6. Scanners

 

One can search the brain with a microscope and not find the mind, and can search the stars with a telescope and not find God.

 J. Gustav White

This is the platform-independent part of Scanners[ABC].

After finding an exploitable peculiarity you need to verify that it exists in a typical population of target executables. Some peculiarities can also be used only once. For example, filling the segment gap on i386 removes the gap.

In the beginning scanners were written in perl. Scan entry point is the last remnant of that age. Unfortunately, typical 64-bit platforms provide a perl with only 32-bit integers, which cannot hold a 64-bit address. The classic tool for calculations with large numbers is bc, a front-end to dc. But classic bc on Solaris has only limited means of formatting output, and I failed to write a perl script that writes a bc script that writes the output. Alternatively, a bidirectional pipe opened by IPC::Open2(3pm), as described in perlipc(1), could be used to let bc do the calculations only. I chose a straightforward implementation in C instead, using the framework in Dual use technology.

6.1. Finding executables

We start the chapter by putting together a list of target executables. The "big" set consists of system files in the usual places like /bin. The "small" set comprises all infected targets created by the examples in this document. Each of these two sets is further divided into statically and dynamically linked files.

At the core of the following script is a combination of file and sed. The only special argument is the postfix required of target file names. A quoted empty string ("") accepts all files in target directories. The other typical value is "_infected". This postfix is used by all examples in this document.

GNU find and GNU xargs provide a nice extension, -print0 and -0, respectively. They are also available on FreeBSD. Another interesting option of GNU xargs, -r, has not made it to FreeBSD, though. Since the script has to work with the native tools of every platform covered here, we can't use any of this.
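
For comparison, here is a minimal sketch of a route that needs neither extension. It assumes a find that implements the POSIX form "-exec ... {} +"; whether every target platform provides it is an assumption, and the script in this section sticks to a plain pipe into xargs. The helper name list_exec, the scratch directory and the file names are made up for the demonstration, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: print every regular file below the given
# directories that has all three execute bits set. File names are
# passed to the command directly, so blanks in names do no harm.
# Assumes a find with the POSIX "-exec ... {} +" form.
list_exec() {
  find "$@" -type f -perm -111 -exec printf '%s\n' {} +
}

# Demonstration in a scratch directory (all names are made up).
demo=$(mktemp -d) || exit 1
printf '#!/bin/sh\n' > "${demo}/plain"
printf '#!/bin/sh\n' > "${demo}/with space"
printf 'data\n' > "${demo}/not_executable"
chmod 755 "${demo}/plain" "${demo}/with space"
chmod 644 "${demo}/not_executable"

found=$(list_exec "${demo}")
printf '%s\n' "${found}"
```

In a real scanner printf would be replaced by file, just as in the xargs pipeline.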

In this script sed does not write all output to stdout. Command "w" is used to write lines matching different patterns to different files. Since the file name is an argument to find-exec.sh, the sed script has to be created inline. However, the syntax rules of sed require consecutive commands of a block to be separated by a line feed. And in a shell script single quotes are required to preserve line feeds in command lines. Altogether this results in a strange sequence of different quotes. The TEVWH_ variables are shown in Variables prefixed with TEVWH_[ABC]. Output and a sample from file are at Finding executables[ABC].
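
The quoting is easier to follow on a reduced example. The following sketch is not part of the original sources. It feeds three made-up lines in the style of file output through a sed script built the same way: double quotes around the parts that need variable expansion, single quotes around the parts that contain line feeds. Unlike the real script it puts the file name in double quotes as well, and it assumes mktemp -d is available.

```shell
#!/bin/sh
# Hypothetical demonstration; input lines and paths are made up.
dir=$(mktemp -d) || exit 1
dst="${dir}/list"

printf '%s\n' \
  '/bin/a: ELF 32-bit LSB executable, statically linked' \
  '/bin/b: ELF 32-bit LSB executable, dynamically linked' \
  '/bin/c: ASCII text' \
| sed -ne \
"/:.*statically linked.*/ {"'
  s///
  w '"${dst}.static"'
  b
}
'"/:.*dynamically linked.*/ {"'
  s///
  w '"${dst}.dynamic"'
  b
}'

# Each list now holds only the file names of its kind.
cat "${dst}.static" "${dst}.dynamic"
```

The empty pattern in "s///" repeats the address pattern, so everything from the first colon on is removed before "w" writes the line.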

Source: src/scanner/find-exec.sh
#!/bin/sh
dst=${1}; shift
postfix=${1}; shift

${TEVWH_PATH_ECHO} [${postfix}] "$@"
type="ELF ${TEVWH_ELF_ADDR_SIZE}-bit ${TEVWH_BYTE_ORDER}SB executable"

${TEVWH_PATH_FIND} "$@" -type f -perm -111 -name "*${postfix}" \
| ${TEVWH_PATH_XARGS} ${TEVWH_PATH_FILE} \
| ${TEVWH_PATH_SED} -ne \
"/:[[:space:]]*${type}.*statically linked.*/ {"'
	s///
	w '${dst}.static'
	b
}
'"/:[[:space:]]*${type}.*dynamically linked.*/ {"'
	s///
	w '${dst}.dynamic'
	b
}'

# finally output a line count
${TEVWH_PATH_WC} -l ${dst}.static ${dst}.dynamic

6.2. Driver scripts

The first kind of scanner parses the output of objdump. Since this output contains the file name of targets we can call objdump with multiple arguments (through xargs). Scan entry point is now the only species of this kind.

Source: src/scanner/objdump.sh
#!/bin/sh
src="${1}"
dst="${2}"
scanner="${3:-entry_point}"
flags="${4:--fh}"

[ -s "${src}" ] || exit 0
TEVWH_TMP=${TEVWH_TMP}; export TEVWH_TMP

${TEVWH_PATH_XARGS} ${TEVWH_PATH_OBJDUMP} "${flags}" \
< "${src}" \
| "./src/scanner/${scanner}/objdump.pl" \
| ${TEVWH_PATH_TEE} "${dst}.full" \
| ${TEVWH_PATH_TAIL} \
> "${dst}"

The second kind of scanner reads a plain list of target file names from stdin.

Source: src/scanner/plain.sh
#!/bin/sh
src="${1}"
dst="${2}"
scanner="${3:-segment_padding}"

# quoted, so that a missing argument fails the test instead of passing it
[ -s "${src}" ] || exit 0
TEVWH_TMP=${TEVWH_TMP}; export TEVWH_TMP

"${TEVWH_TMP}/scanner/${scanner}" < "${src}" 2>&1 \
| ${TEVWH_PATH_TEE} "${dst}.full" \
| ${TEVWH_PATH_GREP} -v ' Ok$' \
| ${TEVWH_PATH_TAIL} \
> "${dst}"

6.3. Scan entry point

This script reads the output of objdump. For each file the start of section .text should equal the entry point. This can be implemented through a simple string comparison, i.e. by checking hexadecimal digits one by one. See Sections[ABC] for an illustrative description based on a dumped ELF header.
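
The idea of comparing addresses as strings can be sketched in a few lines of shell. This is an illustration only, not the scanner itself; the helper name same_address and the sample addresses are made up. Both values are lowercased and stripped of an optional "0x" prefix and of leading zeros, so no integer arithmetic is involved at all.

```shell
#!/bin/sh
# Hypothetical helper: normalize a hexadecimal address to lowercase
# digits without "0x" prefix and without leading zeros.
normalize() {
  printf '%s\n' "$1" \
  | tr 'ABCDEF' 'abcdef' \
  | sed -e 's/^0x//' -e 's/^0*//'
}

# Returns 0 if both addresses are equal as strings, 1 otherwise.
same_address() {
  [ "$(normalize "$1")" = "$(normalize "$2")" ]
}

# Made-up values, written the way objdump prints a start address
# and a section VMA.
if same_address 0x0000000000400440 0000000000400440; then
  echo "entry point equals start of .text"
else
  echo "entry point differs from start of .text"
fi
```

Since only strings are compared, the width of the platform's integers does not matter.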

Source: src/scanner/entry_point/objdump.pl
#!/usr/bin/perl -w
use strict;

my $tmp = $ENV{'TEVWH_TMP'} || die "TEVWH_TMP undefined.";

my $min = 0xFFFFFFFF; my $max = 0; my $detected = 0;
my $nr_files = 0; my $filename; my $entry_point;
while(<>)
{
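  # a new file starts; forget the entry point of the previous one
  if (m#^(/[^:\s]+):#) { $nr_files++; $filename = $1; undef $entry_point; next; }
  if (m#^$tmp/([^:\s]+):#) { $nr_files++; $filename = $1; undef $entry_point; next; }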
  if (m#start address 0x([0-9A-Fa-f]+)#) { $entry_point = lc($1); next; }
  if (m#^Idx Name#)
  {
    if (!defined($entry_point))
    {
      printf "%-44s has no entry point.\n", $filename;
      next;
    }
    my $start_of_text;
    while(<>)
    {
      if (m/^\s*\d+\s+\.text\s+[0-9A-Fa-f]+\s+([0-9A-Fa-f]+)/)
      {
        $start_of_text = lc($1);
        last;
      }
    }
    $entry_point =~ s/^0+//; $start_of_text =~ s/^0+//; 
    if ($entry_point ne $start_of_text)
    {
      $detected++;
      printf "%-44s ep=0x%-8s sot=0x%-8s\n",
        $filename, $entry_point, $start_of_text;
    }
  }
}
printf "files=%04d; detected=%04d\n", $nr_files, $detected;

6.4. Scan segments

6.4.1. target_action #2

This scanner verifies the existence of the gap used in Segment padding infection. The output at Scanners[ABC] is ambiguous. We can tell for certain whether the gap exists or not. But if it does not, we cannot say whether the gap never existed or is occupied by an infection. On platforms other than i386 the alignment is larger than the page size. There it is possible that a small infection taking just one page affects a single target more than once and still leaves a gap larger than the segment alignment. These cases go undetected by this script.

Anyway, det_page is the number of files detected to have a gap smaller than or equal to one page (no further infection possible). det_align is the number of files where the gap is smaller than the segment alignment (probably infected). Obviously these criteria overlap.
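
The arithmetic behind both counters can be tried out in the shell. This is a sketch under assumptions, not the scanner. The gap is computed as in target_action, start of data segment minus start of code segment minus size of code segment in memory. A target is flagged as soon as "gap > limit" is false, which is how the checks in the listing appear to work. The helper name classify, the constants and all addresses are made up.

```shell
#!/bin/sh
# Hypothetical platform constants; the real values come from the
# platform's TEVWH_ELF_PAGE_SIZE and TEVWH_ELF_ALIGN.
page_size=$((0x1000))
align=$((0x1000))

# usage: classify CODE_VADDR CODE_MEMSZ DATA_VADDR
# Prints the gap and which of the two counters the target would hit.
classify() {
  delta=$(( $3 - $1 - $2 ))
  det_page=no
  det_align=no
  if [ "${delta}" -le "${page_size}" ]; then det_page=yes; fi
  if [ "${delta}" -le "${align}" ]; then det_align=yes; fi
  printf 'delta=0x%x det_page=%s det_align=%s\n' \
    "${delta}" "${det_page}" "${det_align}"
}

# Made-up addresses: a gap of exactly one page, then a gap of three pages.
classify 0x08048000 0x4a2c 0x0804da2c
classify 0x08048000 0x4a2c 0x0804fa2c
```

With an alignment larger than the page size a target can hit det_align without hitting det_page, which is the overlap mentioned above.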

Source: src/scanner/segment_padding/action.inc
bool target_action(Target* t, int stat[])
{
  TEVWH_ELF_PHDR* phdr_code;
  size_t delta; /* distance between code and data segment (in memory) */

  TRACE_DEBUG(-1, "target_action\n");

  phdr_code = t->phdr_code;
  delta = phdr_code[1].p_vaddr - phdr_code[0].p_vaddr - phdr_code[0].p_memsz;

  /* counters were initialized to zero, real minimum is probably higher */
  if (stat[3] == 0 || delta < stat[3])
    stat[3] = delta; /* minimum */
  if (delta > stat[4])
    stat[4] = delta; /* maximum */

  CHECK_BEGIN(SCAN, delta, >, TEVWH_ELF_ALIGN, -1, long)
    stat[2]++;
  CHECK_END
  CHECK(SCAN, delta, >, TEVWH_ELF_PAGE_SIZE)

  TRACE_SCAN(-1, "%s ... delta=%#x, Ok\n", t->clean_src, delta);
  return true;
}

6.4.2. print_summary #2 (segments)

Outputs the statistics of target_action #2.

Source: src/scanner/segment_padding/print_summary.inc
int print_summary(int stat[])
{
  print_errno(-1, "files=%d; ok=%d; det_page=%d; det_align=%d; "
    "min=0x%04x; max=0x%04x\n",
    stat[0], stat[1], stat[0] - stat[1], stat[2], stat[3], stat[4]
  );
  return 0;
}

6.5. Food for segment padding

The obvious way to test an infected shell is a shell script. By calling other infected executables from this script we can test a whole set in one go. Apart from the shell this means one statically and one dynamically linked file. But we can't just start any program at random. For demonstration purposes it would be nice if targets could print a short message, e.g. a version number. We start with a list of known command lines.

After some pre-processing this list can be used with grep -f to search target lists for known command lines. A perfect test set consists of three targets from three sources: find-shell, big.dynamic.ok and big.static.ok. If no infectable shell is found we can take the standard shell to drive the test script. In that case the two other target sources can fill up any remaining space. The relative order in these lists shall be maintained. And of course working on duplicates is disgraceful.

grep -v -f can weed out duplicates. But we need to watch out for the special case of an empty pattern file, e.g. by checking with test -s. If the argument to grep -f is an empty file, some implementations of grep match all lines. Together with -v this rejects all lines.
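
A minimal sketch of such a guard follows. It is not taken from the original sources; the helper name drop_known and the file names are hypothetical. The options -F and -x are an assumption of this sketch, chosen so that list entries are compared as whole fixed strings rather than as regular expressions. mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helper: print the lines of CANDIDATES that are not yet
# listed in CHOSEN, keeping their relative order.
# usage: drop_known CHOSEN CANDIDATES
drop_known() {
  if [ -s "$1" ]; then
    # -F fixed strings, -x whole lines only, -v invert the match
    grep -v -F -x -f "$1" "$2"
  else
    # Empty pattern file: bypass grep, so the result does not depend
    # on how the local grep treats an empty list of patterns.
    cat "$2"
  fi
}

# Demonstration with made-up lists.
work=$(mktemp -d) || exit 1
: > "${work}/chosen"
printf '%s\n' /bin/sh /bin/ls /bin/cat > "${work}/candidates"

drop_known "${work}/chosen" "${work}/candidates"   # nothing chosen yet
printf '%s\n' /bin/ls > "${work}/chosen"
drop_known "${work}/chosen" "${work}/candidates"   # /bin/ls is dropped
```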

Altogether the performance is rather bad, especially since grep repeatedly processes the same input. An implementation in perl can store all intermediate results in memory and beats all shell-based solutions by far.

Output is at Food for segment padding[ABC].

6.6. Scan file size

6.6.2. print_summary #3 (file size)

Outputs the statistics of target_action #3.

Source: src/scanner/filesize/print_summary.inc
int print_summary(int stat[])
{
  print_errno(-1, "files=%d; ok=%d; detected=%d\n",
    stat[0], stat[1], stat[0] - stat[1]
  );
  return 0;
}