13. Suspicious code

 

Attorney General Edwin Meese III explained why the Supreme Court's Miranda decision (holding that subjects have a right to remain silent and have a lawyer present during questioning) is unnecessary: "You don't have many suspects who are innocent of a crime. That's contradictory. If a person is innocent of a crime, then he is not a suspect."

 U.S. News and World Report, 10/14/85

The tricks shown in Doing it in C raise a question: can a look at the disassembly find pieces of code that distinguish viruses in general from regular code? You could call that the holy grail of scanning. It is very different from finding specific pieces of code that identify exactly one virus (a signature), which in turn is very different from identifying a virus exactly …

The first problem is how to get at "the code". Everything from the start of the file to the last byte of code is mapped into the code segment, so that segment holds more than just instructions. This is described in How it works and illustrated in Segments of /bin/tcsh. A virus could hide in a region declared as ELF header. VIT and its variations (see Segment padding infection (i)) put their code into section .rodata.

So keep in mind that this chapter is very hypothetical.

13.1. Extracting sections

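Let's have fun by looking at the sections of an executable file. Here comes a script that extracts a single section as raw data. readelf(1) provides a related option, but it is useless for our purpose: it prints a formatted hexadecimal dump, while a disassembler wants the raw bytes.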

-x <number>

--hex-dump=<number>

Displays the contents of the indicated section as a hexadecimal dump.

Source: src/suspicious_code/dumpsection.pl
#!/usr/bin/perl -sw
use strict;

$::file = '/bin/sh' if (!defined($::file));
$::section = '.text' if (!defined($::section));

my $readelf = $ENV{'TEVWH_PATH_READELF'}
|| die "Environment variable TEVWH_PATH_READELF undefined.";

open(READELF, '-|', "$readelf -S $::file") || die "readelf: $! ";
while(<READELF>)
{
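  # \Q...\E quotes the section name; a bare '.' would match any character.
  if (m/^\s*\[\s*\d+\]\s+\Q$::section\E\s/)
  {
    # Strip the "[Nr]" column first. For numbers below 10 readelf prints
    # "[ 9]", which would split into two words and shift every field.
    s/^\s*\[\s*\d+\]\s*//;
    my @word = split;          # Name Type Addr Off Size ...
    my $off = hex($word[3]);   # column "Off"
    my $size = hex($word[4]);  # column "Size"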

    open(FILE, '<', $::file) || die "open: $!";
    sysseek(FILE, $off, 0) || die "seek: $!";
    my $dump;
    sysread(FILE, $dump, $size) || die "read: $!";
    close FILE;
    syswrite STDOUT, $dump;
  }
}
close READELF;
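
The heart of the script is a seek followed by a read. As a cross-check, the same extraction can be sketched in plain shell. The function name dump_range and its arguments are illustrative only and not part of the HOWTO's sources. The offset and size have to be copied by hand from the Off and Size columns of readelf -S, with a 0x prefix added so that bash reads them as hexadecimal.

```shell
#!/bin/bash
# dump_range FILE OFFSET SIZE  (hypothetical helper, not from the HOWTO sources)
# Writes SIZE bytes of FILE, starting at byte OFFSET, to standard output.
# OFFSET and SIZE may use any notation bash arithmetic accepts. readelf
# prints bare hex digits, so prefix them with 0x; a bare leading 0 would
# be taken as octal.
dump_range() {
  local file=$1 off=$(( $2 )) size=$(( $3 ))
  # tail -c +N starts output at byte N (1-based); head -c keeps SIZE bytes.
  tail -c "+$(( off + 1 ))" "$file" | head -c "$size"
}
```

Piping the result of dump_range into ndisasm -u - should then give the same listing as the Perl script, provided the two numbers were copied correctly.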

And the first test is simple. Compare the following output with gdb(1)'s dump in The entry point.

Command: pre/i386-redhat8.0-linux/suspicious_code/intel.sh
#!/bin/bash
TEVWH_PATH_READELF=/usr/bin/readelf
export TEVWH_PATH_READELF

./src/suspicious_code/dumpsection.pl \
	-file=/bin/bash -section=.text \
| /usr/bin/ndisasm -u - \
| /usr/bin/perl -ne 'print $_; exit if m/\b(ret|hlt)\b/;'

Output: out/i386-redhat8.0-linux/suspicious_code/disasm
00000000  31ED              xor ebp,ebp
00000002  5E                pop esi
00000003  89E1              mov ecx,esp
00000005  83E4F0            and esp,byte -0x10
00000008  50                push eax
00000009  54                push esp
0000000A  52                push edx
0000000B  6888360C08        push dword 0x80c3688
00000010  68649C0508        push dword 0x8059c64
00000015  51                push ecx
00000016  56                push esi
00000017  68F0A70508        push dword 0x805a7f0
0000001C  E8BBFBFFFF        call 0xfffffbdc
00000021  F4                hlt
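
The call target 0xfffffbdc in the last but one line looks odd. ndisasm was given no load address, so it numbers the section from 0, and a relative call stores a 32-bit displacement counted from the end of the instruction. The displayed target can be reproduced by hand. The sketch below does the arithmetic in shell; call_target is an illustrative name, and the base argument stands in for the section's real virtual address, which this listing does not show.

```shell
#!/bin/bash
# call_target BASE OFFSET DISP  (hypothetical helper for illustration)
# An E8 call is 5 bytes long. Its target is the address of the following
# instruction plus the 32-bit displacement, taken modulo 2^32.
call_target() {
  local base=$(( $1 )) offset=$(( $2 )) disp=$(( $3 ))
  printf '0x%08x\n' $(( (base + offset + 5 + disp) & 0xffffffff ))
}

# Bytes E8 BB FB FF FF at offset 0x1C: displacement 0xfffffbbb (little-endian).
call_target 0 0x1c 0xfffffbbb    # prints 0xfffffbdc, as in the listing above
```

With the real section address as first argument the result becomes an address inside the process image. Passing that address to ndisasm through its -o option should make the disassembly show it directly.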