3. Implementation details

  1. 3.1. Implementation of types
  2. 3.2. Support routines
  3. 3.3. Run-time checking

3.1. Implementation of types

The C implementation of the type system specified above is based on a single type, name, with the same name as the input algebra. This is defined as follows:

typedef union name_tag {
    unsigned int ag_tag;
    union name_tag *ag_ptr;
    unsigned ag_enum;
    unsigned long ag_long_enum;
    name_dim ag_dim;
    t ag_prim_type;
    ....
} name;

where t runs over all the primitive types. All of the types in the algebra can be packed into a block of cells of type name. The following paragraphs describe the implementation of each type, together with how they are packed as blocks of cells. This is illustrated by the following diagram:

prim enum ptr tag dim PRIM = ENUM = STRUCT = UNION = PTR = LIST = VEC = VEC_PTR =
Figure 1. Packing of types in calculus

Primitive types are implemented by a typedef defining the type in terms of its given definition. A primitive type can be packed into a single cell using the appropriate ag_prim_type field of the union.

Identity types are implemented by a typedef defining the type as being equal to another type from the algebra.

Enumeration types are all implemented as unsigned int if all its values fit into 16 bits, or unsigned long otherwise. An enumeration type can be packed into a single cell using the ag_enum or ag_long_enum field of the union.

Structure types are implemented by a typedef defining the type to be the C structure with the given components. A structure type may be packed into a block of cells by packing each of the components in turn.

Union types are all implemented by a pointer to name. This pointer references a block of cells. The null pointer represents NULL_union, otherwise the first cell contains the union discriminant tag (in the ag_tag field), with the subsequent cells containing the packed field components (shared components first, then unshared components). If the union has only one field then the discriminant can be omitted. The union itself can be packed into a single cell using the ag_ptr field of the union.

Pointer types are all implemented by a pointer to name. Null pointers represent NULL_ptr, otherwise the pointer references a single cell. This cell contains a pointer to the packed version of the object being pointed to in its ag_ptr field. A pointer type itself can be packed into a single cell using the ag_ptr field of the union.

List types are all implemented by a pointer to name. Null pointers represent NULL_list, otherwise the pointer references a block of two cells. The first cell gives the tail of the list in its ag_ptr field; the second cell contains a pointer to the packed version of the head of the list in its ag_ptr field. A list type itself can be packed into a single cell using the ag_ptr field of the union.

Stack types are identical to list types, with the head of the list corresponding to the top of the stack.

Vector pointer and vector types are all implemented by structures defined as follows:

typedef unsigned int name_dim;

typedef struct {
    name *vec;
    name *ptr;
} name_VEC_PTR;

typedef struct {
    name_dim dim;
    name_VEC_PTR elems;
} name_VEC;

The vec field of a vector pointer contains a pointer to a block of cells containing the packed versions of the elements of the vector. The ptr field is a pointer to the current element of this block. A vector type also has a field giving the vector size. A vector pointer type can be packed into a block of two cells, using the ag_ptr field of each to hold its two components. A vector type can similarly be packed into a block of three cells, the first holding the vector size in its ag_dim field.

All size types are implemented as int.

3.2. Support routines

The implementation requires the user to provide certain support functions. Most fundamental are the routines for allocating and deallocating the blocks of cells which are used to store the types. These are specified as follows:

name *gen_name(unsigned int);
void destroy_name(name *, unsigned int);
void dummy_destroy_name(name *, unsigned int);

where gen_name allocates a block of cells of the given size, destroy_name deallocates the given block, and dummy_destroy_name has no effect. The precise details of how the memory management is to be handled are left to the user.

Certain generic list functions must also be provided, namely:

void destroy_name_list(name *, unsigned int);
name *reverse_name_list(name *);
name *append_name_list(name *, name *);
name *end_name_list(name *);

which implement the constructs DESTROY_list, REVERSE_list, APPEND_list and END_list respectively.

Finally a dummy empty vector:

name_VEC empty_name_vec;

needs to be defined.

Examples of these support routines can be found in the calculus program itself.

3.3. Run-time checking

The type checking facilities supported by calculus allow for compile-time detection of many potential programming errors, however there are many problems, such as dereferencing null pointers, deconstructing empty lists and union tag errors which can only be detected at run-time. For this reason calculus can be made to add extra assertions to check for such errors into its output. This is done by specifying the -a command-line option.

These assertions are implemented by means of macros. If the standard macro NDEBUG is not defined then these macros expand to give the run-time checks. If not the output is exactly equivalent to that which would have been output if the -a option had not been given.

The assertions require certain support functions which are output into a separate file, assert_def.h. These functions need to be compiled into the program when NDEBUG is not defined. Note that the functions are implementation specific.