1. Variables and packages

 

Once it hits the fan, the only rational choice is to sweep it up, package it, and sell it as fertilizer.

 anonymous

This document tries to cover multiple platforms through conditional compilation. There is a configure.pl that determines the host type and sets up a config.sh containing environment variable definitions. There are also equivalent config.csh, config.h, config.mak, config.sed, and config.xml files. The Makefile then uses individual sub-directories for each platform. The names of these directories (and some other platform-specific values) are retrieved through environment variables. The directory structure is not without meaning.

1.1. Variables prefixed with TEVWH_

The files in src/ are obfuscated with obscene amounts of variable references like ${TEVWH_ELF_BASE} or even ${TEVWH_PATH_LS}. I admit that using variables instead of plain program names makes shell scripts harder to read. But this is necessary to maintain a minimum level of reproducibility on SunOS. Anyway, the copies in directory pre/ have these references expanded to plain values such as 10000 and /bin/ls. You will encounter this syntax nightmare only in a few places. An almost complete list of used variables is given below.

Note that hexadecimal shell variables actually lack the leading 0x to simplify calculations with bc. These values are also available to C code through corresponding #define statements after #include <config.h>. There, values are not quoted, but hexadecimal values are correctly prefixed by 0x.
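As a small illustration of this convention, the following sketch adds an offset to a base address stored without prefix. The value 10000 is the one mentioned above for TEVWH_ELF_BASE; the offset 74 and the helper name hex_add are made up for this example. It uses the arithmetic built into bash rather than bc, so the base has to be named explicitly with 16#.

```shell
#!/bin/bash
# Value as it would appear in config.sh: hexadecimal, no leading 0x.
# (The assignment is an assumption about the file's contents.)
TEVWH_ELF_BASE=10000

# hex_add A B: add two unprefixed hexadecimal numbers and print the
# unprefixed hexadecimal sum. Hypothetical helper, not part of src/.
hex_add() {
  printf '%x\n' "$(( 16#$1 + 16#$2 ))"
}

# The offset 74 (hexadecimal) is an arbitrary example value.
sum=$( hex_add "${TEVWH_ELF_BASE}" 74 )
echo "shell: ${sum}"      # spelling used by shell variables
echo "C:     0x${sum}"    # spelling used by config.h
```

With bc the same sum would be written as echo "obase=16; ibase=16; 10000 + 74" | bc. Note that bc expects hexadecimal digits in upper case, and that obase has to be set before ibase changes how numbers are read.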

1.2. Variables prefixed with TEVWH_PATH_

1.3. The name of the X

The value of LANG is not directly related. But some tools create strange output for en_US.UTF-8.

While most Linux distributions ship with slightly modified kernels, no vendor has ever dared to mess with the values returned by uname(2). Instead the tradition of distribution dependent text files in directory /etc was established.
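A minimal sketch of this tradition follows. The file names checked are common conventions, not necessarily the ones configure.pl looks at, and the function name guess_distribution is invented for this example.

```shell
#!/bin/bash
# guess_distribution [ROOT]: name the distribution by looking for
# well-known release files below ROOT (default: the real /etc).
guess_distribution() {
  local root=${1:-}
  if   [ -f "${root}/etc/debian_version" ];    then echo debian
  elif [ -f "${root}/etc/redhat-release" ];    then echo redhat
  elif [ -f "${root}/etc/SuSE-release" ];      then echo suse
  elif [ -f "${root}/etc/slackware-version" ]; then echo slackware
  else echo unknown
  fi
}

# uname only describes the kernel; the distribution comes from /etc.
echo "kernel:       $( uname -s ) $( uname -r )"
echo "distribution: $( guess_distribution )"
```

The optional argument only exists to make the function easy to try against a scratch directory.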

1.4. The owner of files

One of the lesser known features of package management is self-reflection. How do we determine the package owning a file if we have the canonical path name?

Debian GNU/Linux 2.2 uses dpkg for package management. It maintains a set of loosely indexed text files in /var/lib/dpkg/. The whole thing is not well suited for our kind of query.

The first half of a simple example does a linear search through /var/lib/dpkg/info/*.list.
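What such a linear search amounts to can be sketched by hand. This is not how dpkg is implemented; it only assumes that every *.list file holds one installed path per line and is named after its package. The function name owner_by_list is made up.

```shell
#!/bin/bash
# owner_by_list PATH [INFODIR]: print the name of every package whose
# file list contains PATH as a complete line.
owner_by_list() {
  local path=$1 info=${2:-/var/lib/dpkg/info}
  local list name
  for list in "${info}"/*.list; do
    [ -f "${list}" ] || continue          # glob matched nothing
    if grep -qxF -- "${path}" "${list}"; then
      name=${list##*/}                    # strip the directory
      echo "${name%.list}"                # strip the suffix
    fi
  done
  return 0
}
```

A call like owner_by_list /bin/sed reads every list file in turn, which is exactly why this kind of query is slow.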

To create the table shown in the abstract a second query is required. This one does a linear search through one huge text file, /var/lib/dpkg/status.

Source: pre/sparc-debian2.2-linux/packages/deb/status.sh
#!/bin/bash
/usr/bin/dpkg -s sed
/bin/echo status=$?

Output: out/sparc-debian2.2-linux/packages/deb/status
Package: sed
Essential: yes
Status: install ok installed
Priority: required
Section: base
Installed-Size: 216
Maintainer: Wichert Akkerman <wakkerma@debian.org>
Version: 3.02-5
Pre-Depends: libc6 (>= 2.1.2)
Description: The GNU sed stream editor.
 sed reads the specified files or the standard input if no
 files are specified, makes editing changes according to a
 list of commands, and writes the results to the standard
 output.

status=0
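The linear search through the status file can be imitated with awk. The sketch assumes the layout visible in the output above: stanzas that start with a "Package:" line, followed by "Field: value" lines. The function name status_field is invented for this example.

```shell
#!/bin/bash
# status_field PACKAGE FIELD [STATUSFILE]: print the value of FIELD
# from the stanza that belongs to PACKAGE.
status_field() {
  local pkg=$1 field=$2 status=${3:-/var/lib/dpkg/status}
  [ -r "${status}" ] || return 1
  awk -v pkg="${pkg}" -v field="${field}" '
    /^Package: /  { current = $2 }          # a new stanza starts
    current == pkg && index($0, field ": ") == 1 {
      print substr($0, length(field) + 3)   # text after "Field: "
      exit
    }
  ' "${status}"
}
```

On the system shown above, status_field sed Version should print the version listed in the output.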

1.5. The owner of /usr/bin/perl

But this is not the end of the story. A particularly absurd example is perl. A chain of symbolic links is not a problem in itself. But what shall we do if neither the links nor the final target are registered?

The solution to the puzzle is a hard link. stat(2) tells how many names refer to the same file. But to actually find these names the complete file system has to be searched, similar to find -xdev -inum. [1] In practice one can assume that this kind of hard link is located in the same directory. Not a guaranteed or fast solution, but manageable.

Source: pre/sparc-debian2.2-linux/packages/deb/hard.sh
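#!/bin/bash
# Follow the chain of symbolic links. After the loop ${ls} holds
# "inode path" of the last name visited, i.e. the final target.
file=$( which perl )
while true; do
  ls=$( /bin/ls -i ${file} )
  file=$( /bin/readlink ${file} ) || break
done

# Split "inode /path" at the first slash.
inum=${ls%%/*}
file=/${ls#*/}
dir=${file%/*}

# Collect all names in the same directory that share the inode.
files=$( /bin/ls -i ${dir} \
	| /bin/grep "${inum}" \
	| /bin/sed "s#.* #${dir}/#" )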
cmd="/usr/bin/dpkg -S '${files}'"
/bin/echo ${cmd}
${cmd}
/bin/echo status=$?

Output: out/sparc-debian2.2-linux/packages/deb/hard
/usr/bin/dpkg -S '/usr/bin/perl-5.005 /usr/bin/perl5.005 /usr/bin/perl5.00503'
dpkg: *'/usr/bin/perl-5.005* not found.
perl-5.005-base: /usr/bin/perl5.005
dpkg: /usr/bin/perl5.00503' not found.
status=0
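The stray quotes in the output above are a side effect of the script: the single quotes stored in cmd are passed to dpkg literally, because ${cmd} is split on white space without being parsed again. Only the middle name is looked up unharmed. A variant that hands over every name as a separate argument, together with the exhaustive search through find, might look like the following sketch. All function names are made up, and the file passed in is expected to be the final target, not a symbolic link.

```shell
#!/bin/bash
# inode_of FILE: print the inode number of FILE itself.
inode_of() {
  local out inum
  out=$( ls -di -- "$1" ) || return 1
  read -r inum _ <<< "${out}"     # "inode name" -> inode
  echo "${inum}"
}

# links_in_dir FILE: print every name in the directory of FILE that
# shares its inode, one per line. Fast, but misses hard links that
# live in other directories.
links_in_dir() {
  local inum dir name
  inum=$( inode_of "$1" ) || return 1
  dir=$( dirname -- "$1" )
  for name in "${dir}"/*; do
    [ "$( inode_of "${name}" 2>/dev/null )" = "${inum}" ] && echo "${name}"
  done
  return 0
}

# links_on_fs FILE TOP: the exhaustive variant. Searches everything
# below TOP without crossing into other file systems.
links_on_fs() {
  local inum
  inum=$( inode_of "$1" ) || return 1
  find "$2" -xdev -inum "${inum}" -print
}

# Example (needs dpkg and bash 4 for mapfile):
#   mapfile -t names < <( links_in_dir /usr/bin/perl5.005 )
#   /usr/bin/dpkg -S "${names[@]}"
```

Keeping the names in an array avoids building a command line as a string, so no quoting survives to confuse dpkg.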

1.6. The source of man-pages

Option -a of man returns all matching entries, not just the lowest section. This behavior is identical between platforms.

Requesting a specific section requires option -s section on SunOS, while Linux and FreeBSD take the section as a plain argument in front of the page name.
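A script that has to run on all three platforms can hide the difference behind a small wrapper. The sketch below only prints the command line it would use; the function name man_section is invented for this example.

```shell
#!/bin/bash
# man_section OS SECTION PAGE: print the man command line suited to
# OS, where OS is the output of "uname -s".
man_section() {
  case "$1" in
    SunOS) echo "man -s $2 $3" ;;
    *)     echo "man $2 $3" ;;
  esac
}

man_section "$( uname -s )" 2 uname
```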

1.7. Verifying installed packages

This chapter is not about checking the integrity of package files. See Intrusion detection systems (i) for a general introduction.

debsums(1) lets you verify everything or complete packages. Option -s is described as "Be silent. Just report problems."

Command: pre/sparc-debian2.2-linux/packages/deb/verify.sh
#!/bin/bash
/usr/bin/debsums -s bash
/bin/echo status=$?
/usr/bin/debsums -s gcc
/bin/echo status=$?

Output: out/sparc-debian2.2-linux/packages/verify
Package bash did not come with checksums
status=0
status=0
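The principle behind such a check can be sketched in a few lines. This is not the code of debsums; it assumes that checksums, where a package ships them, are stored as /var/lib/dpkg/info/PACKAGE.md5sums in the format of md5sum(1) with paths relative to the root directory, and that md5sum is the GNU version, which knows --quiet. The function name verify_package is made up.

```shell
#!/bin/bash
# verify_package PACKAGE [INFODIR] [ROOT]: compare installed files
# with the MD5 sums recorded for PACKAGE and report only mismatches.
# INFODIR must be an absolute path.
verify_package() {
  local pkg=$1 info=${2:-/var/lib/dpkg/info} root=${3:-/}
  local sums=${info}/${pkg}.md5sums
  if [ ! -f "${sums}" ]; then
    echo "Package ${pkg} did not come with checksums"
    return 0
  fi
  # Paths in the list are relative to the root directory.
  ( cd "${root}" && md5sum -c --quiet "${sums}" )
}
```

A missing checksum file is reported but not treated as an error, which mirrors the exit status of 0 shown for bash above.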

To verify against the checksums included in a package file, e.g. on the installation CD, instead of a possibly corrupted database, just specify the package file rather than the package name.

Notes

[1]

Finding the mount point of the file system holding an arbitrary directory is tricky in itself. Field st_dev of the struct stat returned by stat(2) unambiguously identifies a mounted file system. Repeatedly changing into the parent directory until its value of st_dev is different from that of the starting point should find the mount point.
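The walk towards the root directory could be sketched as follows. The example reads st_dev through stat -c %d, which is an option of the stat program in GNU coreutils; other platforms spell it differently. The function name mount_point is invented for this example.

```shell
#!/bin/bash
# mount_point DIR: print the mount point of the file system that
# holds DIR. Relies on "stat -c %d" (GNU coreutils) to read st_dev.
mount_point() {
  local dir dev parent
  # Resolve symbolic links first; the subshell keeps the caller's
  # working directory untouched.
  dir=$( cd -- "$1" > /dev/null && pwd -P ) || return 1
  dev=$( stat -c %d "${dir}" )
  while [ "${dir}" != / ]; do
    parent=$( dirname -- "${dir}" )
    # A different device number in the parent means that DIR is the
    # topmost directory of its file system.
    [ "$( stat -c %d "${parent}" )" = "${dev}" ] || break
    dir=${parent}
  done
  echo "${dir}"
}

mount_point .
```

The result is a suitable starting point for find -xdev -inum.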