2. Indicating API Conformance for a New Machine

  2.1. Implementation Steps
  2.2. Abstract versus Concrete APIs
  2.3. Order of Inclusion
  2.4. Versioning (of Operating Systems and APIs)
  2.5. Omissions (marking things __WRONG)
  2.6. Splitting off new Subsets
  2.7. Workarounds for Deviations
  2.8. Notes on -Ysystem

TODO: what API conformance is. link to main documents

TODO: why you need to lay down API conformance stuff for new machines

This chapter explains how to state to TenDRA which APIs your system conforms to, and which it does not. This process refers to the database of known APIs (defined using tspec), and deals with handling discrepancies between that database and your system.

It is possible that your system claims conformance to more modern APIs than the database currently specifies; in this case those APIs would first need to be added to the database. See the Adding new APIs with tspec developer guide for details of that process. The rest of this document assumes that the (abstract) APIs in question are already present in that database.

2.1. Implementation Steps

TODO: introduce this document first from the perspective of everything going smoothly, then show how to deal with each kind of problem (e.g. not providing an API at all, not providing some bits, hacked includes, hacked includes for building specific APIs, etc)

Here's what you need to do:

  1. TODO: add start-up headers per API. Have them include each other for inheritance. TODO: explain where you can find a list of APIs

  2. Find out which APIs your machine intends to provide. In each start-up header, mark APIs which are not provided as __WRONG_<API>.

  3. Add start-up headers, one per API (see src/lib/machines/netbsd/x86_32/startup/$api.h). Put #define __WRONG_<API>_<HEADER>_H here for standard headers which this machine cannot provide. Feature-requirement macros (like _POSIX_SOURCE) also need to be defined here, to make the /usr/include headers conform.

  4. TODO: add hacked includes if necessary (see src/lib/machines/$os/$arch/include/*.h). Add one per broken /usr/include header file.

Special cases for workarounds are discussed in detail below.

The remainder of this document explains how to undertake these steps, and provides background information which should help them make a little more sense.

2.2. Abstract versus Concrete APIs

Is there really a difference between abstract and concrete APIs? In a specific case an API specification may permit only one possible implementation. In general, however, there is a difference: APIs may permit multiple conformant implementations. Abstract APIs say things like "size_t is an unsigned integer" (example taken from C89), without stating exactly which type of integer that is. Concrete APIs (that is, implementations of APIs) say things like "size_t is unsigned long int".

So API specifications are an abstraction of all possible implementations of that API. tspec's language expresses this abstraction, rather than any particular concrete implementation. See the Tspec guide for more on this subject.

This document is concerned with indicating which abstract APIs are conformed to by the concrete implementation for one machine in particular.

When adding support for a machine, your goal is to squeeze as many APIs out of it as you can, within the set it intends to implement. TODO: explain why.

2.3. Order of Inclusion

TODO: mention that start-up headers exist (not to be confused with the other startup/ directory...) and that hacked includes exist. both explained below. Hacked includes serve as a thin wrapper over the real /usr/include headers, to provide machine-specific tweaks per API header.

The following sets of files make an appearance:

Name
    Purpose

System headers
    Headers provided by the Operating System, in /usr/include or equivalent. These are not part of TenDRA.

Hacked includes
    Ad-hoc workarounds to modify system headers. Typically these exist simply to make the system headers compilable by tcc. Hacked includes are ideally not required at all. If present, there is one per system header which needs wrapping.

tspec-generated source
    #pragma token representations of the TDF tokens present in each abstract API. These source files provide #ifdef hooks to be able to mark certain API subsets as non-conforming from the start-up files.

Start-up headers
    Provision of feature-enabling macros for the underlying system headers, and marking of API subsets as non-conforming for omission. There is one start-up header per API for each machine.

API TDF tokens
    TDF representation of the supported APIs on a particular Operating System and CPU.

Some understanding of how the system works is useful in trying to work round problems. Rather than presenting system headers directly to TenDRA, relevant headers are cherry-picked per API and used to construct a TDF mapping of the abstract API to the system-specific implementations of those constructs.

See the Orientation Guide for context on how this all fits together, and what file types are involved. Here is a worked example for our purposes of porting the API building:

Standards
    C89: “FILE is a type”
    POSIX: “FILE is from C89”, “_POSIX_SOURCE is defined”

Transcription
    tspec/base/ansi/stdio.h.tspec
        +TYPE FILE ;
    base/posix/stdio.h.tspec
        +IMPLEMENT "ansi", "stdio.h.tspec" ;

Transformation (tspec)
    $PREFIX_TSPEC/TenDRA/src/posix.api/stdio.c
        #define __BUILDING_TDF_POSIX_STDIO_H
        #ifndef __WRONG_POSIX
        #ifndef __WRONG_POSIX_STDIO_H
        #include <stdio.h>
        #endif
        #endif

Production (tcc -Ymakelib -D_${OSVER})
    osdep/machines/$os/$cpu/startup/c89.h
        /* empty */
    osdep/machines/$os/$cpu/startup/posix.h
        #include "c89.h"
        #define _POSIX_SOURCE 1
    osdep/machines/$os/$cpu/include/$api.h (reached by #include)
        #if __BUILDING_*
        #if _${OSVER}
        /* hacks go here */
        #endif
        #endif
        #include_next <stdio.h>
    /usr/include/stdio.h (reached by #include_next)
        typedef ... FILE;
    Output: osdep/obj/machines/$os/$cpu/api/apis/$api.api/*.j

TDF linking (tld)
    c89.tl

Result
    ( make_tokdef posix.stdio.FILE - shape ... )    FILE token definition
Figure 1. Files Involved for API Building

Headers are included in the following order:

  1. tspec-generated code. tspec generates API headers and .c files from its repository of abstract API specifications. These generated files are the entry points for building each API, and are responsible for including the machines/ headers described in the steps below.

  2. Start-up headers. These live in:

    src/lib/machines/<Operating System>/<CPU>/startup/<API>.h

    The start-up files contain the macros needed to navigate the system headers for a particular API. Each start-up header corresponds to a single API. The start-up headers define any feature-selection macros required to make the underlying system headers present a particular API. For example, on many systems a POSIX-compatible interface may be presented by having the start-up headers define a _POSIX_SOURCE macro.

    The start-up headers also serve to state which parts of the abstract APIs the machine in question does not implement, and therefore that these parts are not to be considered. Ideally as few as possible are elided.

  3. Hacked includes, if any. A set of replacement system headers, which are checked before the real system headers, are found in:

    src/lib/machines/<Operating System>/<CPU>/include

    These serve to make almost-conformant /usr/include headers appear conformant, and to fix any small-but-compatible mistakes. All this ad-hoc hackery should be per Operating System version, as per the version define (e.g. -D_NETBSD5) given by the build system. This is discussed in more detail in the versioning section of this document.

    Typically the hacked includes go on to #include_next the underlying system headers, but they need not do so.

    These are also used with the -Ysystem option to tcc. In this situation, the hacked includes are directly visible to users' code.

  4. System headers from /usr/include (or whatever location is equivalent, for that machine).

For the above paths, <Operating System> is the operating system name, <CPU> is the CPU type, and <API> is the API name.

TODO: see the orientation guide for how it all fits together wrt what generates what

2.4. Versioning (of Operating Systems and APIs)

Unfortunately this area is very operating system dependent. It has been set up so that it works for the operating systems listed under the supported platforms, but this is not a cast iron guarantee that it will work for other versions of the same operating system.

Hence...

Operating System versions are catered for within one machine/os/arch directory by using version-specific preprocessor guards (#ifdef on the version). For hacked includes, everything other than #include_next should be made specific to one version of an OS. For start-up headers, everything __WRONG should be made specific to one version of an OS. TODO: expand on versioning within an OS

- explain which of these have #ifdef for __NetBSD_version__ etc (or rather we should use the version macro from makedefs, for consistency with respect to other machines/)

Notes on adding *just* a new version, for an existing OS: TODO: - just need a new version? don't touch any of the existing version-specific ifdefs! (if there are none, remove them all and start from scratch)

Abstract API versions are maintained as entirely distinct APIs in tspec's repository (though perhaps inheriting from their previous equivalents). From the perspective of the machines/ headers, these simply appear as separate APIs. TODO: of the api (separate APIs inheriting). state how it manifests. e.g. workaround for posix2 to have that _max value as 255 instead of 256. probably harmless for us to reduce limits in machines/, right?

2.5. Omissions (marking things __WRONG)

Given a set of well-defined APIs and an operating system which claims conformance to them, generating portable TDF requires that omissions from these APIs be registered. The motivation behind this registration is explained in the TDF and Portability paper. This registration process guarantees that the TDF generated will be portable between systems which implement the same API, even if their implementations differ in portions which are not used (as is usually the case).

Non-portable code (with access to all the system provides) may be generated using the special API referenced by -Ysystem. This causes API-checking to be elided, and brings in tokens from the system-headers as-is. In this case the system headers are still wrapped through the hacked includes (if present). This is the only situation in which the content of the hacked includes is directly visible to users' code (as opposed to presenting the user with TDF instead).

With this in mind, the purpose of this section is to describe the process of registering the various places in which your system omits parts of the APIs it claims to implement.

You can use __WRONG_* to mark these invalid when they are not implemented, complete nonsense, or not standard C. Find the (Tspec-generated) source file which is failing to compile. This should contain lines like:

#define __BUILDING_TDF_<API>_<HEADER>_H
#ifndef __WRONG_<API>
#ifndef __WRONG_<API>_<HEADER>_H
....
#endif
#endif

If you insert the line:

#define __WRONG_<API>_<HEADER>_H

in the corresponding API start-up file:

src/lib/machines/<Operating System>/<CPU>/startup/<API>.h

then the library builder will ignore this header. Any subsequent attempt to use one of the features from this header will produce a compile-time error (token not defined). If a machine does not attempt to provide a certain API at all, it makes more sense to elide that API entirely in this way, rather than marking most of its contents as __WRONG individually. TODO: explain this is a last resort, and mention the user-facing side-effects

Omissions are specified in start-up headers, and may be one of three kinds:

#define __WRONG_<SOMETHING>
  • $something is a file within an API TODO: example

  • $something is a subset within an API (e.g. for subsets, __WRONG_XPG3_SEARCH_H_SRCH_PROTO corresponds to apis/xpg3/search.h:+ SUBSET "srch_proto" := { })

  • $something is an entire API TODO: example

2.6. Splitting off new Subsets

during the process of adding new APIs, you may find... e.g. TODO: for netbsd5: #ifdef _NOT_AVAILABLE #define _POSIX_SAVED_IDS 1 - this needs to be moved out into its own API subset, so we can mark it __WRONG

TODO: look in machines/ for TODOs for things to move out into subsets

2.7. Workarounds for Deviations

Workarounds provide a mechanism to modify system headers where they are non-conformant in minor ways. portability#2.2.4.3 gives some examples of this. It is important that these workarounds are only applied when the modified system headers would still be a valid interface for the existing system libraries, otherwise incompatibilities would be introduced. Do not use workarounds for other situations.

TODO: Workarounds (hacks) are intended for when the APIs are semantically valid, but have inappropriate (but compatible) values, minor syntactic errors, or definitions in the wrong headers and suchlike.

Ideally, these are not required at all. They provide a pragmatic option for dealing with unforeseeable anomalies where the implementor is free to do whatever is necessary in the most flexible manner possible. Importantly, these strange workarounds are kept isolated from the rest of the system. In particular, they are not visible to a user.

Conditional macros which are provided:

_NETBSD

_NETBSD identifies the OS version (the architecture and OS name are already known from the machines/ directory you are in)

__BUILDING_LIBS

Modifications which are specific to library building, should be enclosed in:

#ifdef __BUILDING_LIBS
.....
#endif

__BUILDING_TDF_API_$header

__BUILDING_TDF_API_something for working in a header which another header includes, letting you know who is really being built

papers/porting explains the difference, and the rationale behind it. Several specific examples are given there, and this document will not repeat them unnecessarily.

See the appendix for examples.

TODO: mention ABI must not be made incompatible (is that the right term?)

A good way to work out how to customise these files for your particular system is to see how things were done under similar circumstances. Often a problem crops up on more than one machine, and a workaround which works on another platform may serve as inspiration.

If you don't intend to re-distribute the TenDRA source code you also have an option which, for copyright reasons, is not available to us. You can copy the system header into the include directory above and make minor corrections directly. TODO: or maintain automatically-applied patches

TODO: mention trying your best If all else fails you can tell the library building to ignore the header. TODO: See __WRONG above.

2.8. Notes on -Ysystem

Does a user ever have good cause to use -Ysystem? I can't think of a sensible situation.

Hacked includes are used under -Ysystem, as they are for API building. However unlike API building (which presents a well-defined API of TDF tokens to the user), the hacked includes are included directly into the users' code. This is the only situation where hacked includes are visible to the users' code.

The content of the hacked includes is (mostly) enclosed in Operating System version guards on #ifdef _${OSVER}. See the section on versioning above. Recall that _${OSVER} is defined by the makedefs script, and is therefore only defined during API building, not during normal compilation of a user's program. Hence, the bulk of the hacked includes is not active under -Ysystem use.

Would you ever want to put anything in hacked includes which is outside of an _${OSVER} guard? I can't think of a situation where this would be appropriate, however the option is there if required. These would be hacks which are not specific to any particular Operating System version, and are also visible under -Ysystem. Exercise caution here, since this is visible to user code. Avoid introducing strange effects which might confuse the user when they're expecting the system headers to be used verbatim.

Here's an example C program, which uses the libpng library:

#include <png.h>

int main(void) {
	unsigned char a[8] = { 0 };
	(void) png_sig_cmp(a, 0, sizeof a);
	return 0;
}

Typical use specifies the APIs used by that library. In this case, libpng's headers include POSIX headers for their own use, and hence require -Yposix:

clarion% tcc -Yposix -I /usr/pkg/include -L /usr/pkg/lib -lpng a.c
clarion%

An alternative approach would be to write a tspec API for that library. Users may write their own tspec APIs without needing to reinstall the TenDRA programs, but they are unlikely to actually do this, since it is a lot of work and the API system is unfamiliar. TODO: see tspec/tccguide for more. Then we needn't think about the library's dependencies at all, right? except maybe just for installation.

Meanwhile, back in the real world: TODO:

clarion% tcc -Ysystem -I/usr/pkg/include -Wc,-mc a.c
"/usr/include/sys/cdefs.h", line 270: Error:
#error "No function renaming possible"!!!!
  [ISO 6.8.5]: "No function renaming possible".

"/usr/include/sys/bswap.h", line 20: Error:
uint16_t bswap16(uint16_t) __RENAME(!!!!__bswap16) __attribute__((__const__));
  [Syntax]: Parse error before '__RENAME'.
  [Syntax]: Can't recover from this error.

clarion%

Here the system headers for NetBSD 5 make use of the (non-portable) __RENAME() facility, which causes a syntax error for tcc. I'm passing the -mc error formatting option to the C producer, just so you can see the actual line of code responsible for the __RENAME() syntax error. Our hacked includes work around this with:

#ifdef _NETBSD5
#if defined(__RENAME)
#undef __RENAME
#endif
#define __RENAME(x)     /* empty */
#endif

which may be seen by sneakily defining the same _${OSVER} macro used during API building:

clarion% tcc -D_NETBSD5 -Ysystem -I /usr/pkg/include -L /usr/pkg/lib -lpng a.c
clarion%

However, actually doing this is dangerous because values defined in the system headers might differ from what the user expects, having looked at their source in /usr/include. This is also inconsistent, since only some system headers are wrapped by hacked includes; that is, only the ones we needed to hack up for API building. It's not feasible to wrap them all, since the bulk of the system headers are typically system-dependent anyway.

- surely there is no reason a user would need -Ysystem? -Ysystem is only useful if tcc can read the headers. which may be true on some systems, but not on others