1. Variables & packages

 

Once it hits the fan, the only rational choice is to sweep it up, package it, and sell it as fertilizer.

 anonymous

This document tries to cover multiple platforms through conditional compilation. A script, configure.pl, determines the host type and sets up a config.sh containing environment variable definitions. There are also equivalent files config.csh, config.h, config.mak, config.sed, and config.xml. The Makefile then uses individual sub-directories for each platform. The names of these directories (and some other platform-specific values) are retrieved through environment variables. The directory structure is not without meaning.

The files in src/ are obfuscated with obscene amounts of variable references like ${TEVWH_ELF_BASE} or even ${TEVWH_PATH_LS}. I admit that using variables instead of plain program names makes shell scripts harder to read. But this is necessary to maintain a minimum level of reproducibility on SunOS. Anyway, directory pre/ turns these into 10000 and /bin/ls, so you will probably never encounter this syntax nightmare. An almost complete list of the variables used is given below.

Table 1. Variables prefixed with TEVWH_

Variable name   Value on this platform
ARCH            sparc
ASM_COMMENT     !
ASM_FLAVOR
ASM_RETURN      \(restore\|unimp\)
ASM_STYLE       att
BYTE_ORDER      M
CFLAGS          -O1 -I out/sparc-debian2.2-linux -D NDEBUG
ELF_ADDR_SIZE   32
ELF_ALIGN       0x10000
ELF_BASE        0x10000
ELF_EHDR        Elf32_Ehdr
ELF_MAGIC       0x10001
ELF_PHDR        Elf32_Phdr
ELF_SHDR        Elf32_Shdr
HOSTTYPE        Linux/sparc
OS_CODE         sparc-debian2.2-linux
OS_NAME         Debian GNU/Linux 2.2
OS_PKG_SYS      deb
OS_VENDOR       debian
OS_VERSION      2.2
OUT             out/sparc-debian2.2-linux
OUT_XML         out/sparc-debian2.2-linux/xml
PAGESIZE        0x1000
PRE             pre/sparc-debian2.2-linux
PROC_EXE        /proc/self/exe
PROC_MEM        /proc/self/mem
TMP             tmp/sparc-debian2.2-linux
UNAME           Linux

Note that hexadecimal shell variables actually lack the leading 0x, which simplifies calculations with bc. These values are also available to C code through corresponding #define statements after #include <config.h>. The C values are not quoted, but hexadecimal values are correctly prefixed by 0x.
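A minimal sketch of such a calculation, assuming the shell forms of ELF_BASE and PAGESIZE are 10000 and 1000 (the table values without 0x). The helper name hex_calc is hypothetical and not part of the configuration files; it falls back to plain bc if TEVWH_PATH_BC is not set.

```shell
#!/bin/sh
# hex_calc EXPRESSION
# Evaluate an expression whose operands and result are hexadecimal
# numbers without a leading 0x.
hex_calc() {
  # bc accepts only upper-case hexadecimal digits.
  input=$( echo "$1" | tr 'a-f' 'A-F' )
  # obase has to be set first; once ibase is 16, a later "16" would
  # itself be read as a hexadecimal number.
  echo "obase=16; ibase=16; ${input}" | "${TEVWH_PATH_BC:-bc}"
}

# Assumed shell values, i.e. the table entries without 0x.
TEVWH_ELF_BASE=10000
TEVWH_PAGESIZE=1000
```

With these assumed values, hex_calc "${TEVWH_ELF_BASE} + ${TEVWH_PAGESIZE}" prints 11000. A leading 0x can be added afterwards where a C-style constant is needed.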

Table 2. Variables prefixed with TEVWH_PATH_

Variable name   Value on this platform
BC              /usr/bin/bc
CC              /usr/bin/gcc
CHMOD           /bin/chmod
COL             /usr/bin/col
CSH             /usr/bin/tcsh
CUT             /usr/bin/cut
DD              /bin/dd
DEBSUMS         /usr/bin/debsums
DPKG            /usr/bin/dpkg
DU              /usr/bin/du
ECHO            /bin/echo
EXPAND          /usr/bin/expand
FILE            /usr/bin/file
FIND            /usr/bin/find
GDB             /usr/bin/gdb
GREP            /bin/grep
HEAD            /usr/bin/head
HEXDUMP         /usr/bin/hexdump
KILL            /bin/kill
LD              /usr/bin/ld
LDD             /usr/bin/ldd
LS              /bin/ls
MAKE            /usr/bin/make
MAN             /usr/bin/man
MT              /bin/mt
NICE            /usr/bin/nice
NM              /usr/bin/nm
OBJDUMP         /usr/bin/objdump
OD              /usr/bin/od
PERL            /usr/bin/perl
READELF         /usr/bin/readelf
READLINK        /bin/readlink
SED             /bin/sed
SH              /bin/sh
SORT            /usr/bin/sort
STRACE          /usr/bin/strace
STRINGS         /usr/bin/strings
STRIP           /usr/bin/strip
TAIL            /usr/bin/tail
TEE             /usr/bin/tee
TR              /usr/bin/tr
UNIQ            /usr/bin/uniq
WHICH           /usr/bin/which
XARGS           /usr/bin/xargs
XXD             /usr/bin/xxd

1.1. The owner of files

One of the lesser known features of package management is self-reflection. How do we determine the package owning a file if we have the canonical path name?

Debian GNU/Linux 2.2 uses dpkg for package management. It maintains a set of loosely indexed text files in /var/lib/dpkg/. The whole thing is not well suited for our kind of query.

The first half of a simple example does a linear search through /var/lib/dpkg/info/*.list.
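A minimal sketch of such a linear search, assuming one .list file per package with one path name per line. The function name owner_of and its optional second argument are illustrative only; they are not part of dpkg, whose option -S performs this kind of query itself.

```shell
#!/bin/sh
# owner_of FILE [INFO_DIR]
# Print the name of every package whose .list file contains FILE
# as a complete line.
owner_of() {
  info_dir=${2:-/var/lib/dpkg/info}
  # -F: fixed string, -x: match whole lines only, -l: print file names.
  grep -l -x -F -- "$1" "${info_dir}"/*.list 2>/dev/null \
    | sed -e 's#.*/##' -e 's#\.list$##'
}
```

On a Debian system, owner_of /bin/sed would be expected to name the package that ships /bin/sed. The second argument exists only so that the function can be tried against a scratch directory.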

To create the table shown in the abstract, a second query is required. This one does a linear search through one huge text file, /var/lib/dpkg/available.

Source: pre/sparc-debian2.2-linux/packages/deb/avail.sh
#!/bin/sh
/usr/bin/dpkg --print-avail sed
/bin/echo status=$?

Output: out/sparc-debian2.2-linux/packages/deb/avail
Package: sed
Essential: yes
Priority: required
Section: base
Installed-Size: 216
Maintainer: Wichert Akkerman <wakkerma@debian.org>
Architecture: sparc
Version: 3.02-5
Pre-Depends: libc6 (>= 2.1.2)
Filename: dists/potato/main/binary-sparc/base/sed_3.02-5.deb
Size: 68676
MD5sum: 330aeae88d39d63bbf2488b4ee1a004a
Description: The GNU sed stream editor.
 sed reads the specified files or the standard input if no
 files are specified, makes editing changes according to a
 list of commands, and writes the results to the standard
 output.

status=0

But this is not the end of the story. A particularly absurd example is perl. A chain of symbolic links is not a problem in itself. But what shall we do if neither the links nor the final target is registered?

Source: pre/sparc-debian2.2-linux/packages/deb/perl.sh
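#!/bin/sh
# Start with the path found through PATH.
file=$( /usr/bin/which perl )
cmd="/usr/bin/dpkg -S ${file}"

# Append every symbolic link target; readlink fails on a non-link,
# which ends the loop at the final target.
while file=$( /bin/readlink ${file} ); do
  cmd="${cmd} ${file}"
done

# Show the assembled query, then run it.
/bin/echo ${cmd}
${cmd}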

Output: out/sparc-debian2.2-linux/packages/deb/perl
/usr/bin/dpkg -S /usr/bin/perl /etc/alternatives/perl /usr/bin/perl-5.005
dpkg: /usr/bin/perl not found.
dpkg: /etc/alternatives/perl not found.
dpkg: /usr/bin/perl-5.005 not found.

The solution to the puzzle is a hard link. stat(2) tells how many names refer to the same file. But to actually find these names the complete file system has to be searched, similar to find -xdev -inum. [1] In practice one can assume that this kind of hard link is located in the same directory. Not a guaranteed or fast solution, but manageable.

Source: pre/sparc-debian2.2-linux/packages/deb/hard.sh
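#!/bin/sh
# Follow the chain of symbolic links; ${ls} keeps the "inode path"
# line of the last name visited, i.e. the final target.
file=$( /usr/bin/which perl )
while true; do
  ls=$( /bin/ls -i ${file} )
  file=$( /bin/readlink ${file} ) || break
done

# Split "inode /path/to/file" into inode number, full path and directory.
inum=${ls%%/*}
file=/${ls#*/}
dir=${file%/*}

# List all names in that directory that share the inode.
files=$( /bin/ls -i ${dir} \
	| /bin/grep "${inum}" \
	| /bin/sed "s#.* #${dir}/#" )
# The single quotes are literal and end up in the arguments seen by dpkg.
cmd="/usr/bin/dpkg -S '${files}'"
/bin/echo ${cmd}
${cmd}
/bin/echo status=$?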

Output: out/sparc-debian2.2-linux/packages/deb/hard
/usr/bin/dpkg -S '/usr/bin/perl-5.005 /usr/bin/perl5.005 /usr/bin/perl5.00503'
dpkg: *'/usr/bin/perl-5.005* not found.
perl-5.005-base: /usr/bin/perl5.005
dpkg: /usr/bin/perl5.00503' not found.
status=0
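
The same-directory search can also be sketched with an exact comparison of inode numbers, which avoids accidental substring matches and keeps quote characters out of the arguments passed to dpkg. This is only a sketch under stated assumptions, not the script used above: the function name same_inode is hypothetical, the argument is expected to be the final target rather than a symbolic link, and file names must not contain white space.

```shell
#!/bin/sh
# same_inode FILE
# Print every name in the directory of FILE that refers to the same inode.
same_inode() {
  dir=$( dirname "$1" )
  # "ls -i FILE" prints "inode FILE"; keep the inode number only.
  inum=$( ls -i "$1" | awk '{ print $1 }' )
  # Compare the first field as a whole instead of grepping for a substring.
  ls -i "${dir}" | awk -v inum="${inum}" -v dir="${dir}" \
    '$1 == inum { print dir "/" $2 }'
}
```

The result could then be handed to dpkg as separate arguments, for example /usr/bin/dpkg -S $( same_inode /usr/bin/perl-5.005 ).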

1.2. The source of man-pages

Option -a of man returns all matching entries, not just the lowest section. This behavior is identical between platforms.

Requesting a specific section requires option -s section on SunOS, while Linux prefers a plain section.
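
A small sketch of how a script might hide this difference, assuming TEVWH_UNAME holds the value listed in Table 1 and falling back to uname otherwise. The function name man_args and its optional third argument are hypothetical.

```shell
#!/bin/sh
# man_args SECTION TOPIC [UNAME]
# Print the arguments that select SECTION of TOPIC for the man command
# of the given platform.
man_args() {
  case "${3:-${TEVWH_UNAME:-$( uname )}}" in
    SunOS) echo "-s $1 $2" ;;  # SunOS needs option -s
    *)     echo "$1 $2" ;;     # elsewhere the section is a plain argument
  esac
}
```

A call could then look like man $( man_args 2 stat ).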

1.3. Verifying installed packages

This chapter is not about checking the integrity of package files. See Intrusion detection systems for a general introduction.

debsums(1) lets you verify everything or complete packages. Option -s is described as "Be silent. Just report problems."

Command: pre/sparc-debian2.2-linux/packages/deb/verify.sh
#!/bin/sh
/usr/bin/debsums -s bash
/bin/echo status=$?
/usr/bin/debsums -s gcc
/bin/echo status=$?

Output: out/sparc-debian2.2-linux/packages/verify
Package bash did not come with checksums
status=0
status=0

To verify against the checksums included in a package file (e.g. on the installation CD) rather than against a possibly corrupted database, specify the package file instead of the package name.

Notes

[1]

Finding the mount point of the file system holding an arbitrary directory is tricky in itself. Field st_dev of the struct stat returned by stat(2) unambiguously identifies a mounted file system. Repeatedly changing into the parent directory until its value of st_dev is different from that of the starting point should find the mount point.
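
The walk towards the root can be sketched as follows. This is a sketch only: the function name mount_point is hypothetical, and it assumes the stat command of GNU coreutils, whose format %d prints the device number (st_dev). Mounts that share a device number with their parent, such as bind mounts, are not detected this way.

```shell
#!/bin/sh
# mount_point DIRECTORY
# Print the topmost ancestor of DIRECTORY that has the same st_dev.
mount_point() {
  # Resolve to a physical absolute path; the subshell keeps the
  # caller's working directory unchanged.
  dir=$( cd "$1" && pwd -P ) || return 1
  dev=$( stat -c %d "${dir}" ) || return 1
  while [ "${dir}" != "/" ]; do
    parent=$( dirname "${dir}" )
    # Stop as soon as the parent lives on a different device.
    [ "$( stat -c %d "${parent}" )" = "${dev}" ] || break
    dir=${parent}
  done
  echo "${dir}"
}
```

For example, mount_point /usr/bin prints / on a system where /usr is not a separate file system.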