Once it hits the fan, the only rational choice is to sweep it up, package it, and sell it as fertilizer.
anonymous
This document tries to cover multiple platforms through conditional compilation. A configure.pl determines the host type and sets up a config.sh containing environment variable definitions. There are also equivalent config.csh, config.h, config.mak, config.sed, and config.xml files. The Makefile then uses individual sub-directories for each platform. The names of these directories (and some other platform-specific values) are retrieved through environment variables. The directory structure is not without meaning.
src/ is for source code, i.e. text files written and maintained by humans.
pre/sparc-debian2.2-linux/ is directory src/ pre-processed with config.sed. In pre/ all program names are absolute. Magic numbers and platform specific constants are verbatim.
tmp/sparc-debian2.2-linux/ is the only place to hold binaries, i.e. executables and .o files.
out/sparc-debian2.2-linux/ is for the output of executables, hex dumps, disassembly listings, text processing.
The files in src/ are obfuscated with obscene amounts of variable references like ${TEVWH_ELF_BASE} or even ${TEVWH_PATH_LS}. I admit that using variables instead of plain program names makes shell scripts harder to read. But this is necessary to maintain a minimum level of reproducibility on SunOS. Anyway, in directory pre/ these turn into 10000 and /bin/ls. You will encounter this syntax nightmare only in a few places. An almost complete list of the variables used is given below.
Table 1. Variables prefixed with TEVWH_
Variable name | Value on this platform |
---|---|
CFLAGS | -Wall -O1 -I . -I out/sparc-debian2.2-linux -D NDEBUG |
OUT_XML | out/sparc-debian2.2-linux/xml |
Variable name | Value on this platform | Variable name | Value on this platform |
---|---|---|---|
AFLAGS | -I . -D _ASM | ARCH | sparc |
ASM | sparc_Linux_att | ASM_COMMENT | ! |
ASM_FLAVOR | | ASM_OBJDUMP | |
ASM_RETURN | (restore|unimp) | ASM_STYLE | att |
BYTE_ORDER | M | ELF_ADDR | Elf32_Addr |
ELF_ADDR_SIZE | 32 | ELF_ALIGN | 10000 |
ELF_BASE | 10000 | ELF_EHDR | Elf32_Ehdr |
ELF_MAGIC | 10001 | ELF_OFF | Elf32_Off |
ELF_PAGE_SIZE | 1000 | ELF_PHDR | Elf32_Phdr |
ELF_SHDR | Elf32_Shdr | HOSTTYPE | Linux/sparc |
OS_CODE | sparc-debian2.2-linux | OS_NAME | Debian GNU/Linux 2.2 |
OS_PKG_SYS | deb | OS_VENDOR | debian |
OS_VERSION | 2.2 | OUT | out/sparc-debian2.2-linux |
PRE | pre/sparc-debian2.2-linux | PROC_EXE | /proc/self/exe |
PROC_MEM | /proc/self/mem | TMP | tmp/sparc-debian2.2-linux |
UNAME | Linux |
Note that hexadecimal shell variables omit the leading 0x to simplify calculations with bc. These values are also available to C code through corresponding #define statements after #include <config.h>. There the values are not quoted, but hexadecimal values are correctly prefixed with 0x.
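The reason for dropping the prefix shows up as soon as two of these values are fed to bc. What follows is a minimal sketch, not one of the document's scripts; the helper name hex_add is made up, and bc is assumed to be on the PATH.

```shell
#!/bin/bash
# Add two hexadecimal numbers given without the 0x prefix.
# hex_add is a hypothetical helper name used only for this illustration.
hex_add() {
    # bc accepts only upper-case hexadecimal digits.
    local a=${1^^} b=${2^^}
    # obase has to be set first; after ibase=16 a literal 16 would mean 22.
    echo "obase=16; ibase=16; ${a} + ${b}" | bc
}

# ELF_BASE plus ELF_PAGE_SIZE, both taken from Table 1.
hex_add 10000 1000
```

With the values from Table 1 this prints 11000, which can be pasted behind a 0x in C code without further conversion.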
Table 2. Variables prefixed with TEVWH_PATH_
Variable name | Value on this platform | Variable name | Value on this platform |
---|---|---|---|
BASH | /bin/bash | BC | /usr/bin/bc |
CAT | /bin/cat | CC | /usr/bin/gcc |
CHMOD | /bin/chmod | CSH | /usr/bin/tcsh |
CUT | /usr/bin/cut | DD | /bin/dd |
DEBSUMS | /usr/bin/debsums | DISTID | /etc/debian_version |
DPKG | /usr/bin/dpkg | DU | /usr/bin/du |
ECHO | /bin/echo | EXPAND | /usr/bin/expand |
FILE | /usr/bin/file | FIND | /usr/bin/find |
FMT | /usr/bin/fmt | GDB | /usr/bin/gdb |
GREP | /bin/grep | HEXDUMP | /usr/bin/hexdump |
KILL | /bin/kill | LD | /usr/bin/ld |
LDD | /usr/bin/ldd | LS | /bin/ls |
MAKE | /usr/bin/make | MAN | /usr/bin/man |
NICE | /usr/bin/nice | NM | /usr/bin/nm |
OBJDUMP | /usr/bin/objdump | OD | /usr/bin/od |
PERL | /usr/bin/perl-5.005 | READELF | /usr/bin/readelf |
READLINK | /bin/readlink | SED | /bin/sed |
SH | /bin/bash | SORT | /usr/bin/sort |
STRACE | /usr/bin/strace | STRINGS | /usr/bin/strings |
STRIP | /usr/bin/strip | TAIL | /usr/bin/tail |
TEE | /usr/bin/tee | TR | /usr/bin/tr |
UNIQ | /usr/bin/uniq | WC | /usr/bin/wc |
XARGS | /usr/bin/xargs | XXD | /usr/bin/xxd |
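These two tables are what turns src/ into pre/. The real config.sed is generated by configure.pl and is not reproduced here, so the following is only a sketch under the assumption that it boils down to one s command per variable. The function name preprocess is invented for the example, and only two variables are handled.

```shell
#!/bin/bash
# Hypothetical excerpt of a config.sed-style substitution.
# The replacement values are copied from Table 1 and Table 2.
preprocess() {
    sed -e 's#\${TEVWH_ELF_BASE}#10000#g' \
        -e 's#\${TEVWH_PATH_LS}#/bin/ls#g'
}

# A line as it might appear in src/ ...
printf '%s\n' '${TEVWH_PATH_LS} -l # base is 0x${TEVWH_ELF_BASE}' | preprocess
# ... comes out the way it would look in pre/:
# /bin/ls -l # base is 0x10000
```

Using # as the delimiter of the s command avoids having to escape the slashes in path names.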
Command: src/packages/uname.sh
#!/bin/sh
uname -mprs
echo "[${HOSTTYPE}]"
echo "[${VENDOR}]"
echo "[${OSTYPE}]"
echo "[${MACHTYPE}]"
echo "[${LANG}]"
The value of LANG is not directly related to the platform, but some tools produce strange output under en_US.UTF-8.
Output: out/sparc-debian2.2-linux/packages/uname
Linux 2.2.19 sparc unknown
[sparc]
[unknown]
[linux-gnu]
[sparc-unknown-linux-gnu]
[C]
While most Linux distributions ship with slightly modified kernels, no vendor has ever dared to mess with the values returned by uname(2). Instead the tradition of distribution dependent text files in directory /etc was established.
Command: pre/sparc-debian2.2-linux/packages/distid.sh
#!/bin/bash
# We need this script to copy the id-file into directory out/.
# I use many machines to test examples, but only one to render the document.
/bin/cat /etc/debian_version
Output: out/sparc-debian2.2-linux/packages/distid
2.2
One of the lesser known features of package management is self-reflection. How do we determine the package owning a file if we have the canonical path name?
Debian GNU/Linux 2.2 uses dpkg for package management. It maintains a set of loosely indexed text files in /var/lib/dpkg/. The whole thing is not well suited for our kind of query.
Source: pre/sparc-debian2.2-linux/packages/deb/du.sh
#!/bin/bash
/usr/bin/file /var/lib/dpkg/* | /bin/grep -v yesterday
/usr/bin/du -s /var/lib/dpkg/
Output: out/sparc-debian2.2-linux/packages/du
/var/lib/dpkg/alternatives: directory
/var/lib/dpkg/available: English text
/var/lib/dpkg/available-old: English text
/var/lib/dpkg/cmethopt: ASCII text
/var/lib/dpkg/diversions: empty
/var/lib/dpkg/info: directory
/var/lib/dpkg/lock: empty
/var/lib/dpkg/methlock: empty
/var/lib/dpkg/methods: directory
/var/lib/dpkg/status: English text
/var/lib/dpkg/status-old: English text
/var/lib/dpkg/updates: directory
10578 /var/lib/dpkg
The first half of a simple example does a linear search through /var/lib/dpkg/info/*.list:
Source: pre/sparc-debian2.2-linux/packages/deb/simple.sh
#!/bin/bash
/usr/bin/dpkg -S $( which sed )
Output: out/sparc-debian2.2-linux/packages/simple
sed: /bin/sed
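The linear search can be imitated by hand. This is a rough sketch, assuming that each package lists its files one path per line in a file named after the package, with the suffix .list. The function name owner_of and its optional second argument (an alternative info directory) are inventions for this example; dpkg itself also handles patterns and diversions.

```shell
#!/bin/bash
# Print the names of all packages whose file list contains the given path.
# owner_of is a hypothetical helper, not part of dpkg.
owner_of() {
    local path=$1 info=${2:-/var/lib/dpkg/info}
    local list
    for list in "${info}"/*.list; do
        # An unmatched glob stays literal; skip it.
        [ -f "${list}" ] || continue
        # -x matches whole lines only, -F disables regular expressions.
        if grep -qxF -- "${path}" "${list}"; then
            basename "${list}" .list
        fi
    done
}

owner_of /bin/sed
```

Every call reads every list file, which is exactly why this kind of query is slow.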
To create the table shown in the abstract a second query is required. This one does a linear search through one huge text file, /var/lib/dpkg/status.
Source: pre/sparc-debian2.2-linux/packages/deb/status.sh
#!/bin/bash
/usr/bin/dpkg -s sed
/bin/echo status=$?
Output: out/sparc-debian2.2-linux/packages/deb/status
Package: sed
Essential: yes
Status: install ok installed
Priority: required
Section: base
Installed-Size: 216
Maintainer: Wichert Akkerman <wakkerma@debian.org>
Version: 3.02-5
Pre-Depends: libc6 (>= 2.1.2)
Description: The GNU sed stream editor.
sed reads the specified files or the standard input if no
files are specified, makes editing changes according to a
list of commands, and writes the results to the standard
output.
status=0
But this is not the end of the story. A particularly absurd example is perl. A chain of symbolic links is not a problem in itself. But what shall we do if neither the links nor the final target is registered?
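The second query can be sketched in the same spirit. The listing above suggests that the status file is a sequence of paragraphs, each starting with a Package: line; the sketch below relies on that and on nothing else. The helper name status_of and its optional file argument are made up for illustration.

```shell
#!/bin/bash
# Print the Status field of one package from a dpkg-style status file.
# status_of is a hypothetical helper, not part of dpkg.
status_of() {
    local pkg=$1 file=${2:-/var/lib/dpkg/status}
    awk -v pkg="${pkg}" '
        # Remember which paragraph we are in.
        /^Package: / { current = $2 }
        # Stop at the first Status line of the requested package.
        /^Status: / && current == pkg {
            sub(/^Status: /, "")
            print
            exit
        }
    ' "${file}"
}

if [ -r /var/lib/dpkg/status ]; then
    status_of sed
fi
```

On average half of the file is read before the paragraph is found, and the whole file when the package is unknown.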
Source: pre/sparc-debian2.2-linux/packages/deb/perl.sh
#!/bin/bash
file=$( which perl )
cmd="/usr/bin/dpkg -S ${file}"
while file=$( /bin/readlink ${file} ); do
cmd="${cmd} ${file}"
done
/bin/echo ${cmd}
${cmd}
Output: out/sparc-debian2.2-linux/packages/deb/perl
/usr/bin/dpkg -S /usr/bin/perl /etc/alternatives/perl /usr/bin/perl-5.005
dpkg: /usr/bin/perl not found.
dpkg: /etc/alternatives/perl not found.
dpkg: /usr/bin/perl-5.005 not found.
The solution to the puzzle is a hard link. stat(2) tells how many names refer to the same file. But to actually find these names the complete file system has to be searched, similar to find -xdev -inum. [1] In practice one can assume that this kind of hard link is located in the same directory. Not a guaranteed or fast solution, but manageable.
Source: pre/sparc-debian2.2-linux/packages/deb/hard.sh
#!/bin/bash
file=$( which perl )
while true; do
ls=$( /bin/ls -i ${file} )
file=$( /bin/readlink ${file} ) || break
done
inum=${ls%%/*}
file=/${ls#*/}
dir=${file%/*}
files=$( /bin/ls -i ${dir} \
| /bin/grep "${inum}" \
| /bin/sed "s#.* #${dir}/#" )
cmd="/usr/bin/dpkg -S '${files}'"
/bin/echo ${cmd}
${cmd}
/bin/echo status=$?
Output: out/sparc-debian2.2-linux/packages/deb/hard
/usr/bin/dpkg -S '/usr/bin/perl-5.005 /usr/bin/perl5.005 /usr/bin/perl5.00503'
dpkg: *'/usr/bin/perl-5.005* not found.
perl-5.005-base: /usr/bin/perl5.005
dpkg: /usr/bin/perl5.00503' not found.
status=0
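The output shows that the single quotes inside cmd reach dpkg as literal characters, so the first and last names are mangled. A possible variant that avoids the quoting problem is sketched below. It assumes a find that supports -maxdepth and -samefile, which the system shown here may not have; the function name same_inode_names is hypothetical.

```shell
#!/bin/bash
# List all names in the final target's directory that share its inode.
# same_inode_names is a hypothetical helper for this illustration.
same_inode_names() {
    local file=$1 target hops=0
    # Follow the chain of symbolic links, giving up after 40 hops.
    while target=$( readlink "${file}" ) && [ $(( hops++ )) -lt 40 ]; do
        case ${target} in
        /*) file=${target} ;;
        *)  file=$( dirname "${file}" )/${target} ;;
        esac
    done
    # Hard links are assumed to live in the same directory.
    find "$( dirname "${file}" )" -maxdepth 1 -samefile "${file}" | sort
}

if perl=$( command -v perl ); then
    same_inode_names "${perl}"
fi

# Passing each name as a separate argument would then look like this:
#   mapfile -t names < <( same_inode_names "${perl}" )
#   dpkg -S "${names[@]}"
```

An array keeps one name per argument without any quoting inside a string, at the price of requiring bash.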
Option -a of man returns all matching entries, not just the lowest section. This behavior is identical between platforms.
Command: pre/sparc-debian2.2-linux/packages/man-all/Linux.sh
#!/bin/bash
/usr/bin/man -a -w kill
Output: out/sparc-debian2.2-linux/packages/man-all
/usr/share/man/man1/kill.1.gz
/usr/share/man/man2/kill.2.gz
Requesting a specific section requires option -s section on SunOS, while Linux and FreeBSD prefer a plain section.
Command: pre/sparc-debian2.2-linux/packages/man-section/Linux.sh
#!/bin/bash
/usr/bin/man -w 2 kill
Output: out/sparc-debian2.2-linux/packages/man-section
/usr/share/man/man2/kill.2.gz
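This document keeps one script per platform for that difference. If a single script were preferred, the choice could be made at run time, roughly as follows. The helper name man_section_cmd is made up; it only prints the command line and takes the output of uname -s as its first argument.

```shell
#!/bin/bash
# Print the man command line for a section lookup on the given system.
# man_section_cmd is a hypothetical helper for this illustration.
man_section_cmd() {
    local os=$1 section=$2 topic=$3
    case ${os} in
    SunOS) echo "man -s ${section} ${topic}" ;;
    *)     echo "man ${section} ${topic}" ;;
    esac
}

man_section_cmd "$( uname -s )" 2 kill
```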
This chapter is not about checking the integrity of package files. See Intrusion detection systems (i) for a general introduction.
debsums(1) lets you verify everything or complete packages. Option -s is described as "Be silent. Just report problems."
Command: pre/sparc-debian2.2-linux/packages/deb/verify.sh
#!/bin/bash
/usr/bin/debsums -s bash
/bin/echo status=$?
/usr/bin/debsums -s gcc
/bin/echo status=$?
Output: out/sparc-debian2.2-linux/packages/verify
Package bash did not come with checksums
status=0
status=0
To verify against the checksums included in a package file, e.g. on the installation CD, rather than against a possibly corrupted database, specify the package file instead of the package name.
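The message about bash is easier to understand with the mechanism in mind. The sketch below assumes that checksums live in /var/lib/dpkg/info/ in a file named after the package with the suffix .md5sums, in the format of md5sum with paths relative to the root directory. It is a simplified imitation: the function name verify_pkg and its optional arguments are hypothetical, and debsums itself does considerably more.

```shell
#!/bin/bash
# Check the installed files of one package against its md5sums list.
# verify_pkg is a hypothetical helper, not part of debsums.
verify_pkg() {
    local pkg=$1 info=${2:-/var/lib/dpkg/info} root=${3:-/}
    local sums=${info}/${pkg}.md5sums
    if [ ! -f "${sums}" ]; then
        # Nothing to compare against; mimic the message shown above.
        echo "Package ${pkg} did not come with checksums"
        return 0
    fi
    # Paths in the list are relative to the root directory.
    ( cd "${root}" && md5sum -c - ) < "${sums}" > /dev/null 2>&1
}

verify_pkg bash
echo status=$?
```

A package without such a list cannot fail the check, which matches the status of 0 reported for bash above.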
[1] Finding the mount point of the file system holding an arbitrary directory is tricky in itself. Field st_dev of the struct stat returned by stat(2) unambiguously identifies a mounted file system. Repeatedly changing into the parent directory until its value of st_dev is different from that of the starting point should find the mount point.
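The walk described in the footnote can be sketched in a few lines of shell. This assumes the stat command from GNU coreutils, where -c %d prints st_dev; other systems spell that option differently. The function name mount_point_of is made up for the example.

```shell
#!/bin/bash
# Walk towards the root until the device number changes.
# mount_point_of is a hypothetical helper for this illustration.
mount_point_of() {
    local dir dev parent
    # Resolve symbolic links so that dirname really yields the parent.
    dir=$( cd "$1" && pwd -P ) || return 1
    dev=$( stat -c %d "${dir}" ) || return 1
    while [ "${dir}" != / ]; do
        parent=$( dirname "${dir}" )
        # A different st_dev means the parent is on another file system.
        [ "$( stat -c %d "${parent}" )" = "${dev}" ] || break
        dir=${parent}
    done
    printf '%s\n' "${dir}"
}

mount_point_of .
```

A bind mount of a directory from the same file system keeps the same st_dev, so in that case the walk continues past it and reports a directory further up.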