3. Scalar types

  3.1. Arithmetic types
  3.2. Integer literal types
  3.3. Bitfield types

3.1. Arithmetic types

The representations of the basic arithmetic types are target dependent, so, for example, an int may contain 16, 32, 64 or some other number of bits. Thus it is necessary to introduce a token to stand for each of the built-in arithmetic types (including the long long types). Each integral type is represented by a VARIETY token as follows:

Type                 Token                Encoding
char                 ~char                0
signed char          ~signed_char         0 | 4 = 4
unsigned char        ~unsigned_char       0 | 8 = 8
signed short         ~signed_short        1 | 4 = 5
unsigned short       ~unsigned_short      1 | 8 = 9
signed int           ~signed_int          2 | 4 = 6
unsigned int         ~unsigned_int        2 | 8 = 10
signed long          ~signed_long         3 | 4 = 7
unsigned long        ~unsigned_long       3 | 8 = 11
signed long long     ~signed_longlong     3 | 4 | 16 = 23
unsigned long long   ~unsigned_longlong   3 | 8 | 16 = 27

Similarly, each floating point type is represented by a FLOATING_VARIETY token:

Type          Token
float         ~float
double        ~double
long double   ~long_double

Each integral type also has an encoding as a SIGNED_NAT as shown above. This number is a bit pattern built up from the following values:

Type        Encoding
char        0
short       1
int         2
long        3
signed      4
unsigned    8
long long   16

Any target dependent integral type can be represented by a SIGNED_NAT token using this encoding. This representation, rather than one based on VARIETYs, is used for ease of manipulation. The token:

~convert : ( SIGNED_NAT ) -> VARIETY

gives the mapping from the integral encoding to the representing variety. For example, it will map 6 to ~signed_int.
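As an illustration only, the bit-pattern encoding and the ~convert mapping can be modelled in a few lines of Python. The names convert and VARIETY_TOKENS are hypothetical and are not part of TDF; the dictionary simply restates the table of VARIETY tokens given earlier in this section.

```python
# Values from the encoding table: a base type plus optional modifier bits.
CHAR, SHORT, INT, LONG = 0, 1, 2, 3
SIGNED, UNSIGNED, LONG_LONG = 4, 8, 16

# Hypothetical lookup table standing in for ~convert (not a TDF construct).
VARIETY_TOKENS = {
    CHAR: "~char",
    CHAR | SIGNED: "~signed_char",
    CHAR | UNSIGNED: "~unsigned_char",
    SHORT | SIGNED: "~signed_short",
    SHORT | UNSIGNED: "~unsigned_short",
    INT | SIGNED: "~signed_int",
    INT | UNSIGNED: "~unsigned_int",
    LONG | SIGNED: "~signed_long",
    LONG | UNSIGNED: "~unsigned_long",
    LONG | SIGNED | LONG_LONG: "~signed_longlong",
    LONG | UNSIGNED | LONG_LONG: "~unsigned_longlong",
}


def convert(encoding):
    """Return the name of the VARIETY token for a SIGNED_NAT encoding."""
    return VARIETY_TOKENS[encoding]
```

With these definitions, convert(INT | SIGNED) looks up encoding 6 and yields "~signed_int", matching the example in the text.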

The token:

~promote : ( SIGNED_NAT ) -> SIGNED_NAT

describes how to form the promotion of an integral type according to the ISO C/C++ value preserving rules, and is used by the producer to represent target dependent promotion types. For example, the promotion of unsigned short may be int or unsigned int depending on the representation of these types; that is to say, ~promote ( 9 ) will be 6 on some machines and 10 on others. Although ~promote is used by default, a program may specify another token with the same sort signature to be used in its place by means of the directive:

#pragma TenDRA compute promote identifier

For example, a standard token ~sign_promote is defined which gives the older C sign preserving promotion rules. In addition, the promotion of an individual type can be specified using:

#pragma TenDRA promoted type-id : promoted-type-id
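The difference between the value preserving rule behind ~promote and the older sign preserving rule behind ~sign_promote can be sketched as follows. This is a model under stated assumptions, not the token definitions themselves: the functions promote and sign_promote, the target dictionaries WIDE_INT and NARROW_INT, and their bit widths are all hypothetical, and integer types are assumed to have no padding bits.

```python
SIGNED, UNSIGNED = 4, 8
CHAR, SHORT = 0, 1
SIGNED_INT, UNSIGNED_INT = 6, 10

# Hypothetical target descriptions: bit widths and the sign of plain char.
WIDE_INT = {"char": 8, "short": 16, "int": 32, "char_signed": True}
NARROW_INT = {"char": 8, "short": 16, "int": 16, "char_signed": True}


def promote(enc, target):
    """Value preserving promotion of an encoded integral type."""
    base = enc & 3
    if base > SHORT:
        return enc  # int, long and long long promote to themselves
    width = target["char"] if base == CHAR else target["short"]
    if enc & UNSIGNED:
        unsigned = True
    elif enc & SIGNED or base != CHAR:
        unsigned = False
    else:
        unsigned = not target["char_signed"]  # plain char
    max_value = 2 ** width - 1 if unsigned else 2 ** (width - 1) - 1
    int_max = 2 ** (target["int"] - 1) - 1
    # Use int when it can hold every value of the type, else unsigned int.
    return SIGNED_INT if max_value <= int_max else UNSIGNED_INT


def sign_promote(enc):
    """Sign preserving promotion (simplified: plain char counts as signed)."""
    if (enc & 3) > SHORT:
        return enc
    return UNSIGNED_INT if enc & UNSIGNED else SIGNED_INT
```

On the hypothetical WIDE_INT target, promote(9, WIDE_INT) gives 6 because a 32-bit int holds every unsigned short value; on NARROW_INT it gives 10. sign_promote(9) is 10 regardless of the target.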

The token:

~arith_type : ( SIGNED_NAT, SIGNED_NAT ) -> SIGNED_NAT

similarly describes how to form the usual arithmetic result type from two promoted integral operand types. For example, the arithmetic type of long and unsigned int may be long or unsigned long depending on the representation of these types; that is to say, ~arith_type ( 7, 10 ) will be 7 on some machines and 11 on others.
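A minimal sketch of the rule that ~arith_type stands for is given below, using the usual arithmetic conversions expressed in terms of conversion rank. The function names, the ILP32 and LP64 target dictionaries, and the assumption that types have no padding bits are illustrative choices, not part of the token definition. Both operands are assumed to be already promoted, so only int, long and long long encodings with a sign bit are handled.

```python
SIGNED, UNSIGNED, LONG_LONG = 4, 8, 16

# Hypothetical targets: bit widths of the promoted integral types.
ILP32 = {"int": 32, "long": 32, "long long": 64}
LP64 = {"int": 32, "long": 64, "long long": 64}


def rank(enc):
    """Conversion rank: int = 2, long = 3, long long = 4."""
    return (enc & 3) + (1 if enc & LONG_LONG else 0)


def bits(enc, target):
    """Bit width of a promoted type on the given target."""
    return target[("int", "long", "long long")[rank(enc) - 2]]


def arith_type(a, b, target):
    """Result type of the usual arithmetic conversions on a and b."""
    if a == b:
        return a
    if bool(a & UNSIGNED) == bool(b & UNSIGNED):
        return a if rank(a) > rank(b) else b  # same sign: higher rank wins
    u, s = (a, b) if a & UNSIGNED else (b, a)
    if rank(u) >= rank(s):
        return u
    if bits(s, target) > bits(u, target):
        return s  # the signed type can hold every value of the unsigned one
    return (s & ~SIGNED) | UNSIGNED  # unsigned counterpart of the signed type
```

Under these assumptions arith_type(7, 10, LP64) is 7, since a 64-bit long holds every 32-bit unsigned int value, while arith_type(7, 10, ILP32) is 11.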

Any tokenised type declared using:

#pragma token VARIETY v # tv

will be represented by a SIGNED_NAT token with external name tv corresponding to the encoding of v. Special cases of this are the implementation dependent integral types which arise naturally within the language. The external token names for these types are given below:

Type        Token
bool        ~cpp.bool
ptrdiff_t   ptrdiff_t
size_t      size_t
wchar_t     wchar_t

So, for example, a sizeof expression has shape ~convert ( size_t ). The token ~cpp.bool is defined in the default implementation, but the other tokens are defined according to their definitions on the target machine in the normal API library building mechanism.

3.2. Integer literal types

The type of an integer literal is the first type, taken from a list of possible integral types, in which the literal value can be represented. For small literals the type can be determined exactly; for larger literals, however, the result is target dependent. For example, the literal 50000 will have type int on machines in which 50000 fits into an int, and long otherwise. This target dependent mapping is given by a series of tokens of the form:

~lit_* : ( SIGNED_NAT ) -> SIGNED_NAT

which map a literal value to the representation of an integral type. The token used depends on the list of possible types, which in turn depends on the base used to represent the literal and the integer suffix used, as given in the following table:

Base          Suffix   Token            Types
decimal       (none)   ~lit_int         int, long, unsigned long
octal         (none)   ~lit_hex         int, unsigned int, long, unsigned long
hexadecimal   (none)   ~lit_hex         int, unsigned int, long, unsigned long
any           U        ~lit_unsigned    unsigned int, unsigned long
any           L        ~lit_long        long, unsigned long
any           UL       ~lit_ulong       unsigned long
any           LL       ~lit_longlong    long long, unsigned long long
any           ULL      ~lit_ulonglong   unsigned long long

Thus, for example, the shape of the integer literal 50000 is:

~convert ( ~lit_int ( 50000 ) )
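The first-fit selection performed by the ~lit_* tokens can be modelled as below. The candidate lists come from the table above; everything else (the function names, the two target dictionaries and their bit widths, and raising an error when no candidate fits) is an assumption made for the sake of the sketch, since the text does not say what happens to a literal that fits none of the listed types.

```python
UNSIGNED, LONG_LONG = 8, 16

# Candidate type encodings for each token, in the order given in the table.
CANDIDATES = {
    "~lit_int": [6, 7, 11],
    "~lit_hex": [6, 10, 7, 11],
    "~lit_unsigned": [10, 11],
    "~lit_long": [7, 11],
    "~lit_ulong": [11],
    "~lit_longlong": [23, 27],
    "~lit_ulonglong": [27],
}

# Hypothetical targets: bit widths of int, long and long long.
INT32 = {"int": 32, "long": 32, "long long": 64}
INT16 = {"int": 16, "long": 32, "long long": 64}


def max_value(enc, target):
    """Largest value of an encoded type, assuming no padding bits."""
    if enc & LONG_LONG:
        width = target["long long"]
    else:
        width = target["int"] if (enc & 3) == 2 else target["long"]
    return 2 ** width - 1 if enc & UNSIGNED else 2 ** (width - 1) - 1


def lit_type(token, value, target):
    """Encoding of the first candidate type that can represent value."""
    for enc in CANDIDATES[token]:
        if value <= max_value(enc, target):
            return enc
    raise ValueError("literal does not fit any candidate type")
```

For the literal 50000, lit_type("~lit_int", 50000, INT32) is 6 (int), whereas lit_type("~lit_int", 50000, INT16) is 7 (long), which is the target dependence described in the text.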

3.3. Bitfield types

The sign of a plain bitfield type, declared without using signed or unsigned, is left unspecified in C and C++. The token:

~cpp.bitf_sign : ( SIGNED_NAT ) -> BOOL

is used to give a mapping from integral types to the sign of a plain bitfield of that type, in a form suitable for use in the TDF bfvar_bits construct. (Note that ~cpp.bitf_sign should have been a standard C token but was omitted.)