5. Statistics

  5.1. Statistics for Linux APIs
  5.2. Statistics concerning changes in the original source code

5.1. Statistics for Linux APIs

All the files required to create an API are located in a dedicated sub-tree of the TenDRA [4.x] distribution, src/apis/. First, the new API must be specified in a dedicated language; the TenDRA tspec tool translates such an API specification into a target-independent intermediate format. An API specification is split into files, each of which usually corresponds to a system header file available on the target platform(s), e.g. stdio.h, sys/types.h, etc.

Secondly, slight changes are often needed in some of the system header files of a given target in order to build (install) the new API on it. Such changes are either hard-coded in a replacement version of the relevant headers, or performed automatically by scripts of text processing commands; the sed tool and sed scripts are used for the common APIs (for UNIX-like systems) included in current TenDRA distributions. We think that the second method is more flexible. For example, the text processing commands that had to be applied to the system header files in order to build our custom APIs for Linux successfully were in most cases the same for the Linux/i386 and Linux/Alpha targets, so we wrote a single set of sed scripts suitable for both platforms.
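The second method can be pictured with a minimal sketch, written here in Python rather than sed. The two substitution rules and the sample header text are hypothetical illustrations and are not taken from the TenDRA distribution or from our sed scripts; the only point is that one rule set can be applied unchanged to headers coming from different targets.

```python
import re

# Hypothetical edit rules, in the spirit of sed "s/pattern/replacement/" commands.
# Neither the patterns nor the sample header text come from the TenDRA distribution.
RULES = [
    (re.compile(r"\b__inline__\b\s*"), ""),             # drop a compiler-specific keyword
    (re.compile(r"__attribute__\s*\(\(.*?\)\)"), ""),   # drop compiler-specific annotations
]


def patch_header(text, rules=RULES):
    """Apply every substitution rule to each line, as a sed script would."""
    patched_lines = []
    for line in text.splitlines():
        for pattern, replacement in rules:
            line = pattern.sub(replacement, line)
        patched_lines.append(line.rstrip())
    return "\n".join(patched_lines)


# Made-up header fragment standing in for a system header of either target.
SAMPLE = (
    "extern __inline__ int f(int);\n"
    "extern void g(void) __attribute__((noreturn));"
)

patched = patch_header(SAMPLE)
print(patched)
```

Because the rules are keyed on the text of the header and not on the target, a rule that matches nothing on one platform is simply a no-op there, which is what makes a single shared set of scripts practical.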

The following table gives an idea of the amount of work required to specify and to build the APIs used by the ~230 Linux commands which were ported to ANDF. Two types of data are reported: “number of files” and “number of lines”. The files are modules holding lines of either API specification or sed commands. We excluded comment lines from both counts, because the ratio of comments to actual tspec/sed constructs was unusually high.

We recall that our Base API for Linux was derived from existing “standard” APIs, mainly XPG3 and SVID3, which themselves use parts of POSIX. Consequently, the number of tspec lines shown below for this API does not reflect the actual number of specification lines used: most of the specifications we wrote are similar to C language #include statements (tspec +IMPLEMENT or +USE constructs).

In our “Extension API 1”, we quite often rewrote specifications that had an equivalent (either identical or similar) in the reference SVID3 API or SPEC1070 API.

We also recall that we did not start from scratch when making the changes to the Linux system headers needed to build our APIs, for either the Linux/i386 or the Linux/Alpha target: the TenDRA Snapshot we started from (April 95) provided 11 sed scripts for this purpose, which were sufficient to build most of the XPG3 API for the Linux/i386 target.

The number of files given for “Changes in system headers” is the one for the 1st target, Linux/i386 (1.0): as already mentioned, the build of APIs for the 2nd target, Linux/Alpha (1.3), uses the same set of sed scripts. This means that the number of lines counts the sed commands which apply to a Linux/i386 system header, to a Linux/Alpha header, or to both.

                         Base API,   Extension API1,   Extension API2    Changes in system
                         specs       specs             (X11), specs      headers (build)
files                    65 (.h)     66 (.h)           3 (.h)            48 (sed scripts)
lines (excl. comments)   138168418254
Table 1. APIs for Linux / ~230 commands

5.2. Statistics concerning changes in the original source code

The following table lists the packages we installed and validated on both platforms. The first column gives the (Slackware Linux) name of the package, and the second gives the number of source files we dealt with during the ANDFization. The third column gives the number of files (actual source or Makefile) patched during either the initial ANDFization (keeping the i386 as target) or the port to the 2nd target (Alpha). The last two columns show the number of files specifically (re)patched during the port to Alpha and the number of patched Makefiles (over the whole project).

Packages    Source files   Patches Total   Patches Alpha port   Patches Makefiles
bin         455 [a]        86              19                   6
sh_util     72             14              3                    0
txtutils    50             8               2                    1
util        71 [b]         51              22                   3
diff        32             10              3                    0
gzip        36             4               1                    1
grep        16             7               4                    1
find        57             12              2                    0
bc          20             2               0                    0
tar         37             10              2                    1
rcs         28             6               0                    0
byacc       20             8               0                    0
m4          34             1               0                    0
less        44             2               0                    0
flex        46             3               0                    0
ispell      40             21              15                   1
elm         170            27              4                    0
Total       1233           272 (22%)       77                   14
Table 2. Packages ported to Linux/i386 and Linux/Alpha
  [a] Contents of the bin package were partially ANDFized: 48 commands out of 56. Also, only a subset of these ANDFized commands was actually ported to Linux/Alpha: 37 (out of 48). So, the “number of source files” shown excludes source code for non-ANDFized commands. Conversely, the “number of patches, Alpha” would be greater if all the ANDFized commands had been ported to the second target.

  [b] Contents of the Slackware Linux util package were partially ANDFized: 35 commands out of 57. Among these, only 30 were actually ported to Linux/Alpha.

The following table lists the packages we installed and validated on the Linux/i386 platform only. Similar information, except for the number of patches for the Linux/Alpha platform, is given.

Packages          Source files   Patches (i386 only)   Patches Makefiles
aaa_base          122            22                    2
ash               49             15                    1
bsdgames          47 [c]         35                    11
cpio              44             22                    1
getty             18             6                     1
joe               87             26                    0
perl              86             10                    1
ps                35             6                     1
sudo              9              6                     0
tcpip/net-tools   26 [d]         12                    0
xgames            105            28                    3
xlock             37             4                     1
Total             665            175                   22
  [c] The bsdgames package was partially ported to ANDF.

  [d] In the tcpip package, only the Net-tools subset was ported.

From the data shown in the tables above, we can say that about 80% of the original source files did not require any change when ported to ANDF. We also find (Table 2) that more than 70% of the files requiring changes were identified during the initial ANDFization and the validation on the 1st platform. In other words, we had missed approximately 30% of the required changes. Note that in our project both platforms were running the same UNIX-like variant (Linux), so this last ratio could be greater in a less favorable case. Finally, Intel/i386 and Digital/Alpha differ in the size of the pointer type (and of the long type) but share the same byte ordering, so code that depends on byte ordering was not exercised by the second port; there is probably still incorrect code lying inside our ANDF files for Linux commands.
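The ratios quoted above can be rechecked from the totals of Table 2 with a few lines of Python. This is only a sketch of the arithmetic: it assumes that the files (re)patched during the Alpha port are the ones counted as "missed" on the first platform, which is our reading of the figures and is not stated explicitly in the table. The variable names are illustrative.

```python
# Totals copied from the last row of Table 2.
source_files = 1233    # source files handled during the ANDFization
patched_total = 272    # files patched overall (source files or Makefiles)
patched_alpha = 77     # files specifically (re)patched during the Alpha port

# Share of files that needed a patch, and share left untouched.
patched_ratio = patched_total / source_files
unchanged_ratio = 1 - patched_ratio

# Assumption: a file (re)patched for Alpha is a change missed on the 1st platform.
found_on_first = (patched_total - patched_alpha) / patched_total
missed_on_first = patched_alpha / patched_total

print(f"patched:   {patched_ratio:.0%}")
print(f"unchanged: {unchanged_ratio:.0%}")
print(f"found on 1st platform:  {found_on_first:.0%}")
print(f"missed on 1st platform: {missed_on_first:.0%}")
```

Under that assumption the figures come out at roughly 22% patched, 78% unchanged, 72% found on the first platform and 28% missed, in line with the rounded values given in the text.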