C/C++ Producer Configuration Guide

  i. Introduction
    ii. Interface descriptions
  1. Configuring the Compiler
    1.1. Configuration files
    1.2. Low level configuration
    1.3. Scoping options
  2. Implementation limits
  3. Configuration for lexical analysis
    3.1. Lexical analysis
    3.2. Keywords
    3.3. Nested comments
    3.4. Identifier names
    3.5. Identifier name length
  4. Configuration for the preprocessor
    4.1. Preprocessing directives
    4.2. File inclusion directives
    4.3. Macro definitions
  5. Configuration for types
    5.1. The Portability Table
    5.2. Specifying integer literal types
    5.3. Extended integral types
    5.4. Bitfield types
    5.5. Type declarations
    5.6. Type compatibility
    5.7. Incomplete types
    5.8. Built-in types
    5.9. Sign of char
  6. Configuration for literals
    6.1. Integer literals
    6.2. Character literals
    6.3. Writeable String literals
    6.4. Concatenation of character string literals and wide character string literals
    6.5. Escape sequences
  7. Configuration for declarations
    7.1. Empty source files
    7.2. Untagged compound types
    7.3. Empty declarations
    7.4. Unifying the tag name space
    7.5. Extra commas
    7.6. Implicit int
    7.7. Implicit function declarations
    7.8. Forward enumeration declarations
    7.9. Variable scope in for statements
    7.10. Anonymous unions
  8. Configuration for initialisers
    8.1. Initialisation of compound types
    8.2. Variable initialisation
  9. Configuration for expressions
    9.1. Cast expressions
    9.2. Initialiser expressions
    9.3. Lvalue expressions
  10. Configuration for functions
    10.1. Ellipsis in function calls
    10.2. Static block level functions
  11. Configuration for linkage
    11.1. Default linkage
    11.2. Identifier linkage
    11.3. Static identifiers
    11.4. External volatility
    11.5. Function linkage
    11.6. Resolving linkage problems
  A. Standard library

First published .

Revision History

kate

Merged in Integral Type Specification and Dialect Features from C/C++ Checker Reference Manual, and moved out sections relevant to checking to C/C++ Checker Reference Manual.

Moved out documentation for the supplied portability tables from the C/C++ Producer Configuration Guide. Moved out portability table syntax to create a tdfc2portability manpage. Moved out compilation scheme for C++ spec file linking to C/C++ Checker Reference Manual.

kate

Restructured C/C++ Producer Configuration Guide.

kate

Some normalisation for makefile variables, now I'm done moving things around; this marks the start of clearing up the post-restructuring aftermath.

The various *_DIR variables have been replaced with PREFIX_* instead, and used more consistently. Since the layout under $PREFIX is now per-project, the machines directory is gone, and so this change removes variables associated with that.

Hopefully this should be a bit simpler for package maintainers to configure, by overriding whatever they wish.

kate

Move out the C++ LPI token implementations and the C++ (minimal) standard library to the producer project.

kate

Moved out the DRA producers as a standalone tool.

kate

Moved out the “invocation” chapter and related content to the tcpplus manpage.

Moved out the tdfc2dump symbol table dump syntax into a separate manpage, tdfc2dump. Moved out a description of the symbol table semantics into a separate document, The C/C++ Symbol Table Dump.

Moved out The Pragma Token Syntax into a separate document.

kate

Moved out the C/C++ Producer Implementation into a separate document.

truedfx

Suppose three files are being used.

a.c:

#include "a.h"
int main(void) {}

a.h:

#include "b.h"

b.h:

extern int unused;

tendra outputs line directives based on the tokens that are present after preprocessing. Since no such tokens are found in a.h, there is no mention of a.h in the preprocessed output. This causes some configure scripts to fail. By printing the full #include stack every time the file changes, the problem is avoided. The printing is moved into a separate function because it needs to call itself recursively.

truedfx

Allow literal to prefix an expression to indicate that the expression is to be interpreted as a constant expression. This is useful for offsetof, which needs to return a target-dependent constant expression. Without a way to force an expression to be treated as a constant, offsetof cannot be used in (for example) the definitions of enumeration constants.

literal does not become a keyword by default. It must be defined using

# pragma TenDRA keyword __literal for keyword literal

where __literal can be any valid identifier.

truedfx

Support anonymous unions as an extension in C. The code to handle these already exists in tendra, but is hidden in #if LANGUAGE_CPP blocks. Remove these #if blocks, add a new error so that use of this feature is still diagnosed in C by default. Add a pragma

# pragma TenDRA anonymous union ...

to allow this feature to be enabled or disabled during compilation, and default to an error for C, and no error for C++, to preserve the existing behaviour.

DERA

tcpplus 1.8.2; TenDRA 4.1.2 release.

i. Introduction

  ii. Interface descriptions

This document is designed as a technical overview of the usage of the TenDRA C++ to TDF/ANDF producer. It also describes the public interfaces to the producer.

Whereas the interface description contains most of the information which would be required in a users' guide, it is not necessarily in a readily digestible form. The C++ producer is designed to complement the existing TenDRA C to TDF producer; although they are completely distinct programs, the same design philosophy underlies both and they share a number of common interfaces. There are no radical differences between the two producers, besides the fact that the C++ producer covers a vastly larger and more complex language. This means that much of the documentation on the C producer can be taken as also applying to the C++ producer. This document tries to make clear where the C++ producer extends the C producer's interfaces, and those portions of these interfaces which are not directly applicable to C++.

A familiarity with both C++ and TDF is assumed. The version of C++ implemented is that given by the draft ISO C++ standard. All references to "ISO C++" within the document should strictly be qualified using the word "draft", but for convenience this has been left implicit. The C++ producer has a number of switches which allow it to be configured for older dialects of C++. In particular, the version of C++ described in the ARM (Annotated Reference Manual) is fully supported.

The TDF Specification (version 4.0) may be consulted for a description of the compiler intermediate language used. The paper TDF and Portability provides a useful (if slightly old) introduction to some of the ideas relating to static program analysis and interface checking which underlie the whole TenDRA compilation system.

Since this document was originally written, the old C producer, tdfc, has been replaced by a new C producer, tdfc2, which is just a modified version of the C++ producer, tcpplus. All C producer documentation continues to apply to the new C producer, but the new C producer also has many of the features described in this document as only applying to the C++ producer.

ii. Interface descriptions

The most important public interfaces of the C++ producer are the ISO C++ standard and the TDF 4.0 specification; however there are other interfaces, mostly common to both the C and C++ producers, which are described in this section.

An important design criterion of the C++ producer was that it should be strictly ISO conformant by default, but have a method whereby dialect features and extra static program analysis can be enabled. This compiler configuration is controlled by the #pragma TenDRA directives described in the first section.

The requirement that the C and C++ producers should be able to translate portable C or C++ programs into target independent TDF requires a mechanism whereby the target dependent implementations of APIs can be represented. This mechanism, the #pragma token syntax, is described in The Pragma Token Syntax. Note that at present this mechanism only contains support for C APIs; it is considered that the C++ language itself contains sufficient interface mechanisms for C++ APIs to be described.

The C and C++ producers provide two mechanisms whereby type and declaration information derived from a translation unit can be stored to a file for post-processing by other tools. The first is the symbol table dump, which is a public interface designed for use by third party tools. The second is the C/C++ spec file, which is designed for ease of reading and writing by the producers themselves, and is used for intermodule analysis.

The mapping from C++ to TDF implemented by the C++ producer is largely straightforward. There are however target dependencies arising within the language itself which require special handling. These are represented by certain standard tokens which the producer requires to be defined on the target machine. These tokens are also used to describe the interface between the producer and the run-time system. Note that the C++ producer is primarily concerned with the C++ language, not with the standard C++ library. An example implementation of those library components which are required as an integral part of the language (memory allocation, exception handling, run-time type information etc.) is provided. Otherwise, libraries should be obtained from third parties. A number of hints on integrating such libraries with the C++ producer are given.

1. Configuring the Compiler

  1.1. Configuration files
  1.2. Low level configuration
  1.3. Scoping options

This document describes the capabilities of the TenDRA C checker for enforcing the ISO C standard as well as features for detecting areas left undefined by the standard. It also lists the non-ISO dialect features supported by the checker in order to provide compatibility with older versions of C and allow the use of third-party source which may contain non-standard constructs.

The majority of this document describes how the C++ producer can be configured to apply extra static checks or to support various dialects of C++. In all cases the default behaviour is precisely that specified in the ISO C++ standard with no extra checks.

1.1. Configuration files

Certain basic type information is specified using a portability table, which may be specified to the producer using the -n option. The syntax for this file is documented by tdfc2portability.

Mappings to arbitrary execution character sets may be specified using the -C option. The default is to use the same character set as the host system. The syntax for this file is documented by tdfc2charset.

The tcc frontend is typically responsible for providing these files; see the TCC Users' Guide for details.

1.2. Low level configuration

The primary method of configuration is by means of #pragma directives. These directives may be placed within the program itself, however it is generally more convenient to group them into a start-up file in order to create a user-defined compilation profile (see the -X option for tcc). The #pragma directives recognised by the C++ producer have one of the equivalent forms:

#pragma TenDRA ....
#pragma TenDRA++ ....

Some of these are common to the C and C++ producers (although often with differing default behaviour). The C producer will ignore any TenDRA++ directives, so these may be used in compilation profiles which are to be used by both producers. In the descriptions below, the presence of a ++ is used to indicate a directive which is C++ specific; the other directives are common to both producers.

Within the description of the #pragma syntax, on stands for on, off or warning, allow stands for allow, disallow or warning, string-literal is any string literal, integer-literal is any integer literal, identifier is any simple, unqualified identifier name, and type-id is any type identifier. Other syntactic items are described in the text. A complete grammar for the #pragma directives accepted by the C++ producer is given in The Pragma Token Syntax.

The simplest level of configuration is to reset the severity level of a particular error message using:

#pragma TenDRA++ error string-literal on
#pragma TenDRA++ error string-literal allow

The given string-literal should name an error from the make_err error catalogue. A severity of on or disallow indicates that the associated diagnostic message should be an error, which causes the compilation to fail. A severity of warning indicates that the associated diagnostic message should be a warning, which is printed but allows the compilation to continue. A severity of off or allow indicates that the associated error should be ignored. Reducing the severity of any error from its default value, other than via one of the dialect directives described in this section, results in undefined behaviour.
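The correspondence between the two families of severity keywords can be summarised in a few lines of C. This is only an illustrative sketch: the enum and function names are hypothetical and do not reflect how the producer itself represents severities.

```c
#include <string.h>

/* Hypothetical severity levels, for illustration only. */
enum severity { SEV_IGNORE, SEV_WARNING, SEV_ERROR, SEV_UNKNOWN };

/* Map a severity keyword from a #pragma directive to its effect. */
static enum severity severity_of(const char *keyword)
{
    /* "on" and "disallow" both make the diagnostic an error. */
    if (strcmp(keyword, "on") == 0 || strcmp(keyword, "disallow") == 0) {
        return SEV_ERROR;
    }
    /* "warning" prints the diagnostic but lets compilation continue. */
    if (strcmp(keyword, "warning") == 0) {
        return SEV_WARNING;
    }
    /* "off" and "allow" both cause the diagnostic to be ignored. */
    if (strcmp(keyword, "off") == 0 || strcmp(keyword, "allow") == 0) {
        return SEV_IGNORE;
    }
    return SEV_UNKNOWN;
}
```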

The next level of configuration is to reset the severity level of a particular compiler option using:

#pragma TenDRA++ option string-literal on
#pragma TenDRA++ option string-literal allow

The given string-literal should name an option from the option catalogue. The simplest form of compiler option just sets the severity level of one or more error messages. Some of these options may require additional processing to be applied.

It is possible to link a particular error message to a particular compiler option using:

#pragma TenDRA++ error string-literal as option string-literal

Note that the directive:

#pragma TenDRA++ use error string-literal

can be used to raise a given error at any point in a translation unit in a similar fashion to the #error directive. The values of any parameters for this error are unspecified.

The directives just described give the primitive operations on error messages and compiler options. Many of the remaining directives in this section are merely higher level ways of expressing these primitives.

1.3. Scoping options

A new checking scope may be started by inserting the pragma:

#pragma TenDRA begin

at the outermost level. The scope runs until the matching:

#pragma TenDRA end

directive, or to the end of the translation unit (the ISO C standard definition of a translation unit as being a source file, together with any headers or source files included using the #include preprocessing directive, less any source lines skipped by any of the conditional inclusion preprocessing directives, is used throughout this document).

Checking scopes may be nested in the obvious way.

Each new checking scope inherits its initial set of checks from the checking scope which immediately contains it (this includes the implicit main checking scope consisting of the entire source file). Any checks switched on or off within the scope apply only to that scope and any scope it contains. The set of checks applied reverts to its previous state at the end of a scope. Thus, for example:

#pragma TenDRA variable analysis on
/* Variable analysis is on here */

#pragma TenDRA begin
#pragma TenDRA variable analysis off
	/* Variable analysis is off here */
#pragma TenDRA end

/* Variable analysis is on again here */

Once a check has been set any attempt to change its status within the same scope is flagged as an error. If checks need to be switched on and off in the same source file, they must be properly scoped. The built-in compilation modes have the entire source file as their scope.

The method of applying different checking profiles to different parts of a program clearly needs to take into account those properties of C which can circumvent such scoping. Consider for example:

#pragma TenDRA begin
#pragma TenDRA unknown escape allow
#define STRING "hello\!"
#pragma TenDRA end

char * f () {
	return ( STRING ) ;
}

The macro STRING is defined in an area where unknown escape sequences, such as \!, are allowed, but it is expanded in an area where they are not allowed (this is the default setting). The conventional approach to macro expansion would lead to the unknown escape sequence being flagged as an error, even though the user probably intended to avoid this. The checker therefore expands all macros using the checking profile in which they were defined, rather than the current checking scope.

The directives describing the user's desired checking profile could be included directly in the program itself, ideally in some configuration file which is #include'd in all source files. It is however perhaps more appropriate to store the directives as a startup file, file say, which is passed to the checker using the -ffile command line option. It should be noted that user-defined compilation modes are defined on top of a built-in mode base (normally Xc, the default mode). It is therefore important to scope the new checking profile as described above.

Names may be associated with checking scopes by using an alternative form of the begin directive:

#pragma TenDRA begin name environment identifier

where identifier is any valid C identifier. Thereafter a statement of the form:

#pragma TenDRA use environment identifier

changes the current checking environment to the environment associated with identifier.

Sometimes it may be desirable to use different checking profiles for different parts of a translation unit, e.g. applying less strict checks to any system headers which may be included. The checker can be configured to apply a named checking scope, env_name, to any files included from a directory which has been named dir_name, using:

#pragma TenDRA directory dir_name use environment env_name

The directory name must be passed to the checker using the -N dir_name : dir -I dir command line option. This is equivalent to the usual -Idir option for specifying include paths, except that it also attaches the name dir_name to the directory.

Most compiler options are scoped. A checking scope may be defined by enclosing a list of declarations within:

#pragma TenDRA begin
....
#pragma TenDRA end

If the final end directive is omitted then the scope ends at the end of the translation unit. Checking scopes may be nested in the obvious way. A checking scope inherits its initial set of checks from its enclosing scope (this includes the implicit main checking scope consisting of the entire input file). Any checks switched on or off within a scope apply only to the remainder of that scope and any scope it contains. A particular check can only be set once in a given scope. The set of applied checks reverts to its previous state at the end of the scope.

A checking scope can be named using the directives:

#pragma TenDRA begin name environment identifier
....
#pragma TenDRA end

Checking scope names occupy a namespace distinct from any other namespace within the translation unit. A named scope defines a set of modifications to the current checking scope. These modifications may be reapplied within a different scope using:

#pragma TenDRA use environment identifier

The default behaviour is not to allow checks set in the named checking scope to be reset in the current scope. This can however be modified using:

#pragma TenDRA use environment identifier reset allow

Another use of a named checking scope is to associate a checking scope with a named include file directory. This is done using:

#pragma TenDRA directory identifier use environment identifier

where the directory name is one introduced via a -N command-line option. The effect of this directive, if a #include directive is found to resolve to a file from the given directory, is as if the file was enclosed in directives of the form:

#pragma TenDRA begin
#pragma TenDRA use environment identifier reset allow
....
#pragma TenDRA end

The checks applied to the expansion of a macro definition are those from the scope in which the macro was defined, not that in which it was expanded. The macro arguments are checked in the scope in which they are specified, that is to say, the scope in which the macro is expanded. This enables macro definitions to remain localised with respect to checking scopes.
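As an illustration of how these directives combine, the following start-up fragment defines a named checking scope and attaches it to a named include directory. It is a sketch only: the environment name relaxed and the directory name vendor are hypothetical, and vendor is assumed to have been introduced with a -N command-line option.

```c
/* Define a named set of relaxations ("relaxed" is a hypothetical name). */
#pragma TenDRA begin name environment relaxed
#pragma TenDRA unknown escape allow
#pragma TenDRA dollar as ident allow
#pragma TenDRA end

/* Apply those relaxations to any file included from the directory
   named "vendor" (assumed to be declared via a -N option). */
#pragma TenDRA directory vendor use environment relaxed
```

Headers found under the named directory are then processed as if wrapped in their own begin/end scope, so the relaxations do not leak into the including source file.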

2. Implementation limits

This table gives the default implementation limits imposed by the C++ producer for the various implementation quantities listed in Annex B of the ISO C++ standard, together with the minimum limits allowed in ISO C and C++. A default limit of none means that the quantity is limited only by the size of the host machine (either ULONG_MAX or until it runs out of memory). A limit of target means that while no limit is imposed by the C++ front-end, particular target machines may impose such limits.

Quantity identifier    Min C limit    Min C++ limit    Default limit
statement_depth        15             256              none
hash_if_depth          8              256              none
declarator_max         12             256              none
paren_depth            32             256              none
name_limit             31             1024             none
extern_name_limit      6              1024             target
external_ids           511            65536            target
block_ids              127            1024             none
macro_ids              1024           65536            none
func_pars              31             256              none
func_args              31             256              none
macro_pars             31             256              none
macro_args             31             256              none
line_length            509            65536            none
string_length          509            65536            none
sizeof_object          32767          262144           target
include_depth          8              256              256
switch_cases           257            16384            none
data_members           127            16384            none
enum_consts            127            4096             none
nested_class           15             256              none
atexit_funcs           32             32               target
base_classes           N/A            16384            none
direct_bases           N/A            1024             none
class_members          N/A            4096             none
virtual_funcs          N/A            16384            none
virtual_bases          N/A            1024             none
static_members         N/A            1024             none
friends                N/A            4096             none
access_declarations    N/A            4096             none
ctor_initializers      N/A            6144             none
scope_qualifiers       N/A            256              none
external_specs         N/A            1024             none
template_pars          N/A            1024             none
instance_depth         N/A            17               17
exception_handlers     N/A            256              none
exception_specs        N/A            256              none

It is possible to impose lower limits on most of the quantities listed above by means of the directive:

#pragma TenDRA++ option value string-literal integer-literal

where string-literal gives one of the quantity identifiers listed above and integer-literal gives the limit to be imposed. An error is reported if the quantity exceeds this limit (note however that checks have not yet been implemented for all of the quantities listed). Note that the name_limit and include_depth implementation limits can be set using dedicated directives.

The maximum number of errors allowed before the producer bails out can be set using the directive:

#pragma TenDRA++ set error limit integer-literal

The default value is 32.
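For instance, a compilation profile might tighten a couple of limits and lower the error limit as sketched below. The quantity identifiers are taken from the table above; the numeric values are arbitrary choices made for illustration.

```c
/* Report statement nesting deeper than 64 levels. */
#pragma TenDRA++ option value "statement_depth" 64

/* Report macro invocations with more than 31 arguments. */
#pragma TenDRA++ option value "macro_args" 31

/* Give up after 10 errors rather than the default 32. */
#pragma TenDRA++ set error limit 10
```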

3. Configuration for lexical analysis

  3.1. Lexical analysis
  3.2. Keywords
  3.3. Nested comments
  3.4. Identifier names
  3.5. Identifier name length

3.1. Lexical analysis

During lexical analysis, a source file which is not empty should end in a newline character. It is possible to relax this constraint using the directive:

#pragma TenDRA no nline after file end allow

3.2. Keywords

In several places in this section it is described how to introduce keywords for TenDRA language extensions. By default, no such extra keywords are defined. There are also low-level directives for defining and undefining keywords. The directive:

#pragma TenDRA++ keyword identifier for keyword identifier

can be used to introduce a keyword (the first identifier) standing for the standard C++ keyword given by the second identifier. The directive:

#pragma TenDRA++ keyword identifier for operator operator

can similarly be used to introduce a keyword giving an alternative representation for the given operator or punctuator, as, for example, in:

#pragma TenDRA++ keyword and for operator &&

Finally the directive:

#pragma TenDRA++ undef keyword identifier

can be used to undefine a keyword.

3.3. Nested comments

C-style comments do not nest. The directive:

#pragma TenDRA nested comment analysis on

enables a check for the characters /* within C-style comments.

The occurrence of the /* characters inside a C comment, i.e. text surrounded by the /* and */ symbols, is usually a mistake and can lead to the unexpected termination of a comment. By default such nested comments are processed silently, however an error or warning can be produced by setting:

#pragma TenDRA nested comment analysis status

with status as on or warning. If status is off the default behaviour is restored.
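The kind of mistake this check is aimed at can be sketched as follows. The function name total is hypothetical; the point is that the first comment is never closed, so it silently swallows the next statement.

```c
#pragma TenDRA nested comment analysis warning

/* Returns 1, not 3: the comment opened on the "sum += 1" line is not
   closed there, so it runs on to the end of the next line's comment
   and the statement "sum += 2;" is never compiled. */
static int total(void)
{
    int sum = 0;
    sum += 1;   /* add the first term
    sum += 2;   /* add the second term */
    return sum;
}
```

With the check enabled, the second /* would be reported, drawing attention to the unterminated comment on the line before it.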

3.4. Identifier names

During lexical analysis, each character in the source file has an associated look-up value which is used to determine whether the character can be used in an identifier name, is a white space character etc. These values are stored in a simple look-up table. It is possible to set the look-up value using:

#pragma TenDRA++ character character-literal as character-literal allow

which sets the look-up for the first character to be the default look-up for the second character. The form:

#pragma TenDRA++ character character-literal disallow

sets the look-up of the character to be that of an invalid character. The forms:

#pragma TenDRA++ character string-literal as character-literal allow
#pragma TenDRA++ character string-literal disallow

can be used to modify the look-up values for the set of characters given by the string literal. For example:

#pragma TenDRA character '$' as 'a' allow
#pragma TenDRA character '\r' as ' ' allow

allows $ to be used in identifier names (like a) and carriage return to be a white space character. The former is a common dialect feature and can also be controlled by the directive:

#pragma TenDRA dollar as ident allow

The ISO C standard (Section 6.1) states that the use of the character $ in identifier names is illegal. The pragma:

#pragma TenDRA dollar as ident allow

can be used to allow such identifiers, which by default are flagged as errors. There is also a disallow variant which restores the default behaviour.
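A minimal sketch of code relying on this dialect feature is shown below. The identifier names are hypothetical, and whether other compilers accept $ in identifiers is implementation dependent.

```c
#pragma TenDRA dollar as ident allow

/* Both the function name and a parameter name contain '$'. */
static int total$cost(int unit$price, int quantity)
{
    return unit$price * quantity;
}
```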

3.5. Identifier name length

Under the ISO C standard rules on identifier name length, an implementation is only required to treat the first 31 characters of an internal name and the first 6 characters of an external name as significant. The TenDRA C checker provides a facility for users to specify the maximum number of characters allowed in an identifier name, to prevent unexpected results when the application is moved to a new implementation.

The maximum number of characters allowed in an identifier name can be set using the directives:

#pragma TenDRA set name limit integer-literal
#pragma TenDRA++ set name limit integer-literal warning

This length is given by the name_limit implementation quantity mentioned above. Identifiers which exceed this length raise an error or a warning, but are not truncated.

#pragma TenDRA set name limit integer_constant

There is currently no distinction made between external and internal names for length checking. Identifier name lengths are not checked in the default mode.
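The check amounts to a simple length comparison, as the following sketch suggests. The helper exceeds_name_limit is hypothetical and is not part of the producer; the limit of 31 is just the ISO C minimum for internal names.

```c
#include <stddef.h>
#include <string.h>

#pragma TenDRA set name limit 31

/* Returns non-zero if an identifier is longer than the given limit.
   The identifier is only reported, never truncated. */
static int exceeds_name_limit(const char *name, size_t limit)
{
    return strlen(name) > limit;
}
```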

4. Configuration for the preprocessor

  4.1. Preprocessing directives
  4.2. File inclusion directives
  4.3. Macro definitions

4.1. Preprocessing directives

Non-standard preprocessing directives can be controlled using the directives:

#pragma TenDRA directive ppdir allow
#pragma TenDRA directive ppdir (ignore) allow

where ppdir can be assert, file, ident, import (C++ only), include_next (C++ only), unassert, warning (C++ only) or weak. The second form causes the directive to be processed but ignored (note that there is no (ignore) disallow form). The treatment of other unknown preprocessing directives can be controlled using:

#pragma TenDRA unknown directive allow

Cases where the token following the # in a preprocessing directive is not an identifier can be controlled using:

#pragma TenDRA no directive/nline after ident allow

When permitted, unknown preprocessing directives are ignored.

By default, unknown #pragma directives are ignored without comment, however this behaviour can be modified using the directive:

#pragma TenDRA unknown pragma allow

Note that any unknown #pragma TenDRA directives always give an error.

Older preprocessors allowed text after #else and #endif directives. The following directive can be used to enable such behaviour:

#pragma TenDRA text after directive allow

Such text after a directive is ignored.

Some older preprocessors have problems with white space in preprocessing directives - whether at the start of the line, before the initial #, or between the # and the directive identifier. Such white space can be detected using the directives:

#pragma TenDRA indented # directive allow
#pragma TenDRA indented directive after # allow

respectively.
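The two cases can be illustrated as follows, here with the severity set to warning so that each occurrence is reported. The macro names are hypothetical.

```c
#pragma TenDRA indented # directive warning
#pragma TenDRA indented directive after # warning

  #define INDENTED_HASH 1   /* white space before the initial # */
#  define INDENTED_NAME 2   /* white space between # and the directive name */
```

Both forms are valid ISO C, which is why the checks are aimed at portability to older preprocessors rather than at conformance.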

4.2. File inclusion directives

There is a maximum depth of nested #include directives allowed by the C++ producer. This depth is given by the include_depth implementation quantity mentioned above. Its value is fairly small in order to detect recursive inclusions. The maximum depth can be set using:

#pragma TenDRA includes depth integer-literal

A further check, for full pathnames in #include directives (which may not be portable), can be enabled using the directive:

#pragma TenDRA++ complete file includes allow

4.3. Macro definitions

By default, multiple consistent definitions of a macro are allowed. This behaviour can be controlled using the directive:

#pragma TenDRA extra macro definition allow

The ISO C/C++ rules for determining whether two macro definitions are consistent are fairly restrictive. A more relaxed rule allowing for consistent renaming of macro parameters can be enabled using:

#pragma TenDRA weak macro equality allow
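As a sketch of what the relaxed rule permits, the two definitions below differ only in the name of the macro parameter. The names SQUARE and square_of are hypothetical, and other compilers may warn about the redefinition.

```c
#pragma TenDRA weak macro equality allow

#define SQUARE(x) ((x) * (x))
#define SQUARE(y) ((y) * (y))   /* same body, parameter consistently renamed */

static int square_of(int n)
{
    return SQUARE(n);
}
```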

In the definition of macros with parameters, a # in the replacement list must be followed by a parameter name, indicating the stringising operation. This behaviour can be controlled by the directive:

#pragma TenDRA no ident after # allow

which allows a # which is not followed by a parameter name to be treated as a normal preprocessing token.

In a list of macro arguments, the effect of a sequence of preprocessing tokens which otherwise resembles a preprocessing directive is undefined. The C++ producer treats such directives as normal sequences of preprocessing tokens, but can be made to report such behaviour using:

#pragma TenDRA directive as macro argument allow

5. Configuration for types

  5.1. The Portability Table
  5.2. Specifying integer literal types
  5.3. Extended integral types
  5.4. Bitfield types
  5.5. Type declarations
  5.6. Type compatibility
  5.7. Incomplete types
  5.8. Built-in types
  5.9. Sign of char

5.1. The Portability Table

The portability table is used by the checker to describe the minimum assumptions about the representation of the integral types. It contains information on the minimum integer sizes and the minimum range of values that can be represented by each integer type, the sign of plain char, and whether signed types can be assumed to be symmetric (for example, [-127,127]) or maximum (for example, [-128,127]). The format for this file is documented by tdfc2portability.

The minimum integer ranges are deduced from the minimum integer sizes as follows. Suppose b is the minimum number of bits that will be used to represent a certain integral type, then:

  • For unsigned integer types the minimum range is [0, 2^b - 1];

  • For signed integer types if signed_range is maximum the minimum range is [-2^(b-1), 2^(b-1) - 1]. Otherwise, if signed_range is symmetric the minimum range is [-(2^(b-1) - 1), 2^(b-1) - 1];

  • For the type char which is not specified as signed or unsigned, if char_type is signed then char is treated as signed, if char_type is unsigned then char is treated as unsigned, and if char_type is either, the minimum range of char is the intersection of the minimum ranges of signed char and unsigned char.
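The rules above can be written out as a short C sketch. All type and function names here are hypothetical and are used purely to make the arithmetic concrete; b is assumed to lie between 1 and 64.

```c
#include <stdint.h>

/* Hypothetical model of the signed_range setting. */
enum signed_range { RANGE_SYMMETRIC, RANGE_MAXIMUM };

struct int_range {
    int64_t  min;
    uint64_t max;
};

/* Unsigned type of b bits: [0, 2^b - 1]. */
static struct int_range unsigned_min_range(unsigned b)
{
    struct int_range r;
    r.min = 0;
    r.max = (b >= 64) ? UINT64_MAX : ((UINT64_C(1) << b) - 1);
    return r;
}

/* Signed type of b bits: [-2^(b-1), 2^(b-1) - 1] for maximum,
   [-(2^(b-1) - 1), 2^(b-1) - 1] for symmetric. */
static struct int_range signed_min_range(unsigned b, enum signed_range sr)
{
    struct int_range r;
    uint64_t top = (UINT64_C(1) << (b - 1)) - 1;   /* 2^(b-1) - 1 */
    r.max = top;
    r.min = -(int64_t)top;
    if (sr == RANGE_MAXIMUM) {
        r.min -= 1;
    }
    return r;
}

/* Plain char when char_type is "either": the intersection of the
   signed and unsigned ranges, i.e. [0, 2^(b-1) - 1]. */
static struct int_range plain_char_either_range(unsigned b)
{
    struct int_range r;
    r.min = 0;
    r.max = (UINT64_C(1) << (b - 1)) - 1;
    return r;
}
```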

5.2. Specifying integer literal types

By default tdfc2 assumes that all integer ranges conform to the minimum ranges prescribed by the ISO C standard, i.e. char contains at least 8 bits, short and int contain at least 16 bits and long contains at least 32 bits. If the -Y32bit flag is passed to the checker it assumes that integers conform to the minimum ranges commonly found on most 32 bit machines, i.e. int contains at least 32 bits and int is strictly larger than short so that the integral promotion of unsigned short is int under the ISO C standard integer promotion rules.
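The effect of knowing that int is strictly larger than short can be sketched as below. The function is hypothetical and assumes types without padding bits, so that an int of n bits has n - 1 value bits.

```c
#include <stdbool.h>

/* Returns true if every unsigned short value fits in int, in which
   case the ISO C integral promotion of unsigned short is int.
   Otherwise the promoted type is unsigned int. */
static bool ushort_promotes_to_int(unsigned short_bits, unsigned int_bits)
{
    return int_bits - 1 >= short_bits;
}
```

With a 16 bit short and a 32 bit int the promotion is to int; if both types were only 16 bits wide it would be to unsigned int, which is why the result cannot be fixed when only the minimum ISO C sizes are assumed.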

The integer literal pragmas are used to define the method of computing the type of an integer literal. Integer literals cannot be used in a program unless the class to which they belong has been described using an integer literal pragma. Each built-in checking mode includes some integer literal pragmas describing the semantics appropriate for that mode. If these built-in modes are inappropriate, then the user must describe the semantics using the pragma below:

#pragma integer literal literal_class lit_class_type_list

The literal_class identifies the type of literal integer involved. The possibilities are:

  • decimal

  • octal

  • hexadecimal

Each of these types can optionally be followed by unsigned and/or long to specify an unsigned and/or long type respectively.

The values of the integer literals of any particular class are divided into contiguous sub-ranges specified by the lit_class_type_list which takes the form below:

lit_class_type_list :
	* int_type_spec
	integer_constant int_type_spec | lit_class_type_list

int_type_spec :
	: type_name
	* warning? : identifier
	** :

The first integer constant, i1 say, identifies the range [0, i1], the second, i2 say, identifies the range [i1 + 1, i2]. The symbol * specifies the unlimited range upwards from the last integer constant. Each integer constant must be strictly greater than its predecessor.

Associated with each sub-range is an int_type_spec which is either a type, a procedure token identifier with an optional warning (see G.9) or a failure. For each sub-range:

  • If the int_type_spec is a type name, then it must be an integral type and specifies the type associated with literals in that sub-range.

  • If the int_type_spec is an identifier, then the type of integer is computed by a procedure token of that name which takes the integer value as a parameter and delivers its type. The procedure token must have been declared previously as

    #pragma token PROC ( VARIETY ) VARIETY

    Since the type of the integer is computed by a procedure token which may be implemented differently on different targets, there is the option of producing a warning whenever the token is applied.

  • If the int_type_spec is **, then any integer literal lying in the associated sub-range will cause the checker to raise an error.

For example:

#pragma integer literal decimal 0x7fff : int | 0x7fffffff : long | * : unsigned long

divides unsuffixed decimal literals into three ranges: literals in the range [0, 0x7fff] are of type int, literals in the range [0x8000, 0x7fffffff] are of type long, and the remainder are of type unsigned long.
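
The sub-range lookup that this pragma describes can be sketched as a table search. The code below is an illustration only: the type and function names are hypothetical, and the table simply transcribes the three sub-ranges of the example.

```c
#include <assert.h>
#include <stddef.h>

enum lit_type { LIT_INT, LIT_LONG, LIT_UNSIGNED_LONG };

struct lit_sub_range {
    unsigned long upper;   /* inclusive upper bound; unused if unbounded */
    int unbounded;         /* non-zero for the final "*" sub-range */
    enum lit_type type;
};

/* 0x7fff : int | 0x7fffffff : long | * : unsigned long */
static const struct lit_sub_range decimal_ranges[] = {
    { 0x7fffUL,     0, LIT_INT },
    { 0x7fffffffUL, 0, LIT_LONG },
    { 0UL,          1, LIT_UNSIGNED_LONG }
};

enum lit_type decimal_literal_type(unsigned long value)
{
    size_t n = sizeof decimal_ranges / sizeof decimal_ranges[0];
    size_t i;

    for (i = 0; i + 1 < n; i++) {
        /* Each bound must be strictly greater than its predecessor. */
        assert(i == 0 ||
               decimal_ranges[i].upper > decimal_ranges[i - 1].upper);
        if (value <= decimal_ranges[i].upper)
            return decimal_ranges[i].type;
    }
    /* The last entry is the unlimited "*" sub-range. */
    assert(decimal_ranges[n - 1].unbounded);
    return decimal_ranges[n - 1].type;
}
```

Because each bound is inclusive, 0x7fff maps to int while 0x8000 is the first value that maps to long.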

There are four pre-defined procedure tokens supplied with the compiler which are used in the startup files to provide the default specification for integer literals:

  • ~lit_int is the external identification of a token that returns the integer type according to the rules of ISO C for an unsuffixed decimal;

  • ~lit_hex is the external identification of a token that returns the integer type according to the rules of ISO C for an unsuffixed hexadecimal;

  • ~lit_unsigned is the external identification of a token that returns the integer type according to the rules of ISO C for integers suffixed by U only;

  • ~lit_long is the external identification of a token that returns the integer type according to the rules of ISO C for integers suffixed by L only.

5.3. Extended integral types

The long long integral types are not part of ISO C or C++ by default, however support for them can be enabled using the directive:

#pragma TenDRA longlong type allow

This support includes allowing long long in type specifiers and allowing LL and ll as integer literal suffixes.

There is a further directive given by the two cases:

#pragma TenDRA set longlong type : long long
#pragma TenDRA set longlong type : long

which can be used to control the implementation of the long long types. Either they can be mapped to the default representation, which is guaranteed to contain at least 64 bits, or they can be mapped to the corresponding long types.

Because these long long types are not an intrinsic part of C++ the implementation does not integrate them into the language as fully as is possible. This is to prevent the presence or otherwise of long long types affecting the semantics of code which does not use them. For example, it would be possible to extend the rules for the types of integer literals, integer promotion types and arithmetic types to say that if the given value does not fit into the standard integral types then the extended types are tried. This has not been done, although these rules could be implemented by changing the definitions of the standard tokens used to determine these types. By default, only the rules for arithmetic types involving a long long operand and for LL integer literals mention long long types.

5.4. Bitfield types

The C++ rules on bitfield types differ slightly from the C rules. Firstly any integral or enumeration type is allowed in a bitfield, and secondly the bitfield width may exceed the underlying type size (the extra bits being treated as padding). These properties can be controlled using the directives:

#pragma TenDRA extra bitfield int type allow
#pragma TenDRA bitfield overflow allow

respectively.

The ISO C standard only allows signed int, unsigned int and their equivalent types as type specifiers in bitfields. Using the default checking profile, tdfc2 raises errors for other integral types used as type specifiers in bitfields. This behaviour may be modified using the pragma:

#pragma TenDRA extra int bitfield type permit

where permit is one of allow (no errors raised), warning (allow non-int bitfields through with a warning) or disallow (raise errors for non-int bitfields).

If non-int bitfields are allowed, the bitfield is treated as if it had been declared with an int type of the same signedness as the given type. The use of the type char as a bitfield type still generally causes an error, since whether a plain char is treated as signed or unsigned is implementation-dependent. The pragma:

#pragma TenDRA character set-sign

where set-sign is signed, unsigned or either, can be used to specify the signedness of a plain char bitfield. If set-sign is signed or unsigned, the bitfield is treated as though it were declared signed char or unsigned char respectively. If set-sign is either, the sign of the bitfield is target-dependent and the use of a plain char bitfield causes an error.

5.5. Type declarations

C does not allow multiple definitions of a typedef name, whereas C++ allows multiple consistent definitions. This behaviour can be controlled using the directive:

#pragma TenDRA extra type definition allow

In accordance with the ISO C standard, in default mode tdfc2 does not allow a type to be defined more than once using a typedef. This behaviour can be changed using the pragma:

#pragma TenDRA extra type definition permit

where permit is allow (silently accepts redefinitions, provided they are consistent), warning or disallow.

5.6. Type compatibility

The directive:

#pragma TenDRA incompatible type qualifier allow

allows objects to be redeclared with different cv-qualifiers (normally such redeclarations would be incompatible). The composite type is qualified using the join of the cv-qualifiers in the various redeclarations.

The directive:

#pragma TenDRA compatible type : type-id == type-id : allow

asserts that the given two types are compatible. Currently the only implemented version is char * == void * which enables char * to be used as a generic pointer as it was in older dialects of C.

5.7. Incomplete types

Some dialects of C allow incomplete arrays as member types. These are generally used as a place-holder at the end of a structure to allow for the allocation of an arbitrarily sized array. Support for this feature can be enabled using the directive:

#pragma TenDRA incomplete type as object type allow
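
A minimal sketch of the idiom this directive is aimed at, using hypothetical names. C99 later standardised the construct as a flexible array member, so the fragment below also compiles with any C99 compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct packet {
    size_t len;
    unsigned char data[];   /* incomplete array type as the last member */
};

/* Allocates a packet with room for len bytes of zeroed data. */
struct packet *packet_new(size_t len)
{
    struct packet *p = malloc(sizeof *p + len);
    if (p != NULL) {
        p->len = len;
        memset(p->data, 0, len);
    }
    return p;
}
```

The caller releases the packet with free; the array itself occupies no space in sizeof (struct packet).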

The ISO C standard (Section 6.1.2.5) states that an incomplete type, e.g. an undefined structure or union type, is not an object type and that array elements must be of object type. The default behaviour of the checker causes errors when incomplete types are used to specify array element types. The pragma:

#pragma TenDRA incomplete type as object type permit

can be used to alter the treatment of array declarations with incomplete element types. permit is one of allow, disallow or warning as usual.

5.8. Built-in types

The definitions of implementation dependent integral types which arise naturally within the language - the type of the difference of two pointers, ptrdiff_t, and the type of the sizeof operator, size_t - given in the <stddef.h> header can be overridden using the directives:

#pragma TenDRA set ptrdiff_t : type-id
#pragma TenDRA set size_t : type-id

These directives are useful when targeting a specific machine on which the definitions of these types are known; while they may not affect the code generated, they can cut down on spurious conversion warnings. Note that although these types are built into the producer, they are not visible to the user unless an appropriate header is included (with the exception of the keyword wchar_t in ISO C++). However, directives of the form:

#pragma TenDRA++ type identifier for type-name

can be used to make these types visible. They are equivalent to a typedef declaration of identifier as the given built-in type, ptrdiff_t, size_t or wchar_t.

5.9. Sign of char

Whether plain char is signed or unsigned is implementation dependent. By default the implementation is determined by the definition of the ~char token, however this can be overridden in the producer either by means of the portability table or by the directive:

#pragma TenDRA character character-sign

where character-sign can be signed, unsigned or either (the default). Again this directive is useful primarily when targeting a specific machine on which the signedness of char is known.
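
When the signedness of char on a target is not known in advance, it can be found from <limits.h>. The helper below (a hypothetical name) is a sketch of such a check; its result could then inform the choice of character-sign.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, else 0. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```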

6. Configuration for literals

  1. 6.1. Integer literals
  2. 6.2. Character literals
  3. 6.3. Writeable String literals
  4. 6.4. Concatenation of character string literals and wide character string literals
  5. 6.5. Escape sequences

6.1. Integer literals

The rules for finding the type of an integer literal can be described using directives of the form:

#pragma TenDRA integer literal literal-spec

where:

literal-spec :
	literal-base literal-suffix? literal-type-list

literal-base :
	octal
	decimal
	hexadecimal

literal-suffix :
	unsigned
	long
	unsigned long
	long long
	unsigned long long

literal-type-list :
	* literal-type-spec
	integer-literal literal-type-spec | literal-type-list
	? literal-type-spec | literal-type-list

literal-type-spec :
	: type-id
	* allow? : identifier
	* * allow? :

Each directive gives a literal base and suffix, describing the form of an integer literal, and a list of possible types for literals of this form. This list gives a mapping from the value of the literal to the type to be used to represent the literal. There are three cases for the literal type: it may be a given integral type, it may be calculated using a given literal type token (see C/C++ Producer Implementation), or it may cause an error to be raised. There are also three cases for describing a literal range: it may be given by values less than or equal to a given integer literal, it may be given by values which are guaranteed to fit into a given integral type, or it may match any value. For example:

#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
#pragma TenDRA integer literal decimal 32767 : int | ** : l_i

describes how to find the type of a decimal literal with no suffix. Values less than or equal to 32767 have type int; larger values have a target dependent type calculated using the token ~lit_int. Introducing a warning into the directive will cause a warning to be printed if the token is used to calculate the type.

Note that this scheme extends that implemented by the C producer, because of the need for more accurate information in the C++ producer. For example, the specification above does not fully express the ISO rule that the type of a decimal integer is the first of the types int, long and unsigned long which it fits into (it only expresses the first step). However with the C++ extensions it is possible to write:

#pragma token PROC ( VARIETY c ) VARIETY l_i # ~lit_int
#pragma TenDRA integer literal decimal ? : int | ? : long |\
		? : unsigned long | ** : l_i
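
The ISO rule that this second specification expresses can be sketched in C as a cascade of range tests. The enumeration and function names are hypothetical; LIT_TOKEN stands for the final ** case, where the type is left to the ~lit_int token.

```c
#include <assert.h>
#include <limits.h>

enum lit_type { LIT_INT, LIT_LONG, LIT_UNSIGNED_LONG, LIT_TOKEN };

/* First of int, long, unsigned long that can hold the value. */
enum lit_type iso_decimal_literal_type(unsigned long long value)
{
    if (value <= (unsigned long long)INT_MAX)
        return LIT_INT;                /* ? : int */
    if (value <= (unsigned long long)LONG_MAX)
        return LIT_LONG;               /* ? : long */
    if (value <= (unsigned long long)ULONG_MAX)
        return LIT_UNSIGNED_LONG;      /* ? : unsigned long */
    return LIT_TOKEN;                  /* ** : l_i */
}
```

The bounds come from the compiling machine's <limits.h>, which mirrors the way a ? entry defers to the range of the named type rather than to a fixed constant.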

6.2. Character literals

By default, a simple character literal has type int in C and type char in C++. The type of such literals can be controlled using the directive:

#pragma TenDRA++ set character literal : type-id

The type of a wide character literal is given by the implementation defined type wchar_t. By default, the definition of this type is taken from the target machine's <stddef.h> C header (note that in ISO C++, wchar_t is actually a keyword, but its underlying representation must be the same as in C). This definition can be overridden in the producer by means of the directive:

#pragma TenDRA set wchar_t : type-id

for an integral type type-id.
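
The default C rule for simple character literals is easy to observe: since such a literal has type int, its size is that of int rather than char. A small sketch, with a hypothetical function name, valid when compiled as C:

```c
#include <assert.h>
#include <stddef.h>

/* In C, 'a' has type int, so this returns sizeof (int). */
size_t char_literal_size(void)
{
    return sizeof 'a';
}
```

Compiled as C++, where the literal has type char, the same expression yields 1.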

6.3. Writeable String literals

By default, character string literals have type char [n] in C and older dialects of C++, but type const char [n] in ISO C++. Similarly wide string literals have type wchar_t [n] or const wchar_t [n]. Whether string literals are const or not can be controlled using the two directives:

#pragma TenDRA++ set string literal : const
#pragma TenDRA++ set string literal : no const

In the case where literals are const, the array-to-pointer conversion is allowed to cast away the const to allow for a degree of backwards compatibility. The status of this deprecated conversion can be controlled using the directive:

#pragma TenDRA writeable string literal allow

(yes, I know that that should be writable). Note that this directive has a slightly different meaning in the C producer.

The ISO C standard, section 6.1.4, states that if the program attempts to modify a string literal of either form, the behaviour is undefined. Assignments to string literals of the form:

"abc" = '3';

always result in errors. Other attempts to modify members of string literals, e.g.

"abc"[1] = '3';

are permitted in the default checking mode. This behaviour can be changed using:

#pragma TenDRA writeable string literal permit

where permit may be allow, warning or disallow.

6.4. Concatenation of character string literals and wide character string literals

Adjacent string literal tokens of similar types (either both character string literals or both wide string literals) are concatenated at an early stage in the parser; however, it is unspecified what happens if a character string literal token is adjacent to a wide string literal token. By default this gives an error, but the directive:

#pragma TenDRA unify incompatible string literal allow

can be used to enable the strings to be concatenated to give a wide string literal.

If a ' or " character does not have a matching closing quote on the same line then it is undefined whether an implementation should report an unterminated string or treat the quote as a single unknown character. By default, the C++ producer treats this as an unterminated string, but this behaviour can be controlled using the directive:

#pragma TenDRA unmatched quote allow

The ISO C standard, section 6.1.4, states that if a character string literal is adjacent to a wide character string literal, the behaviour is undefined. By default, this is flagged as an error by the checker. If the pragma:

#pragma TenDRA unify incompatible string literal permit

is used, with permit set to allow or warning the character string literal is converted to a wide character string literal and the strings are concatenated, although in the warning case a warning is output. The disallow version of the pragma restores the default behaviour.

6.5. Escape sequences

By default, if the character following the \ in an escape sequence is not one of those listed in the ISO C or C++ standards then an error is given. This behaviour, which is left unspecified by the standards, can be controlled by the directive:

#pragma TenDRA unknown escape allow

The result is that the \ in unknown escape sequences is ignored, so that \z is interpreted as z, for example. Individual escape sequences can be enabled or disabled using the directives:

#pragma TenDRA++ escape character-literal as character-literal allow
#pragma TenDRA++ escape character-literal disallow

so that, for example:

#pragma TenDRA++ escape 'e' as '\033' allow
#pragma TenDRA++ escape 'a' disallow

sets \e to be the ASCII escape character and disables the alert character \a.
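
The effect of these directives on individual escapes can be modelled by a small lookup function. This is a sketch of the resulting behaviour, not of the producer's implementation: the function name is hypothetical, it assumes that unknown escape allow and the escape 'e' as '\033' directive are both in force, and it leaves out octal and hexadecimal escapes.

```c
#include <assert.h>

/* Returns the character denoted by the escape sequence \c. */
int resolve_escape(char c)
{
    switch (c) {
    case 'a':  return '\a';
    case 'b':  return '\b';
    case 'f':  return '\f';
    case 'n':  return '\n';
    case 'r':  return '\r';
    case 't':  return '\t';
    case 'v':  return '\v';
    case '\\': return '\\';
    case '\'': return '\'';
    case '"':  return '"';
    case '?':  return '?';
    case 'e':  return 033;   /* added by: escape 'e' as '\033' allow */
    default:   return c;     /* unknown escape: the \ is ignored */
    }
}
```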

By default, if the value of a character, given for example by a \x escape sequence, does not fit into its type then an error is given. This implementation dependent behaviour can however be controlled by the directive:

#pragma TenDRA character escape overflow allow

the value being converted to its type in the normal way.

The ISO C standard specifies a small set of escape sequences in strings, for example \n as newline. Unknown escape sequences lead to an error in the default mode; however, the severity of the error may be altered using:

#pragma TenDRA unknown escape permit

where permit is allow (silently replaces the unknown escape sequence, \z say, by z), warning or disallow.

7. Configuration for declarations

  1. 7.1. Empty source files
  2. 7.2. Untagged compound types
  3. 7.3. Empty declarations
  4. 7.4. Unifying the tag name space
  5. 7.5. Extra commas
  6. 7.6. Implicit int
  7. 7.7. Implicit function declarations
  8. 7.8. Forward enumeration declarations
  9. 7.9. Variable scope in for statements
  10. 7.10. Anonymous unions

7.1. Empty source files

ISO C requires that a translation unit should contain at least one declaration. C++ and older dialects of C allow translation units which contain no declarations. This behaviour can be controlled using the directive:

#pragma TenDRA no external declaration allow

The ISO standard states that each source file should contain at least one declaration or definition. Source files which contain no external declarations or definitions are flagged as errors by the checker in default mode. The severity of the error may be altered using:

#pragma TenDRA no external declaration permit

where the options for permit are allow (no errors raised), warning or disallow.

7.2. Untagged compound types

ISO C++ requires every declaration or member declaration to introduce one or more names into the program. The directive:

#pragma TenDRA unknown struct/union allow

can be used to relax one particular instance of this rule, by allowing anonymous class definitions (recall that anonymous unions are objects, not types, in C++ and so are not covered by this rule).

The ISO C standard states that a declaration must declare at least a declarator, a tag or the members of an enumeration. The checker detects such declarations and, by default, raises an error. The severity of the errors can be altered by:

#pragma TenDRA unknown struct/union permit

where permit may be allow to allow code such as:

struct {
	int i;
	int j;
};

through without errors (statements such as this occur in some system headers) or disallow to restore the default behaviour.

7.3. Empty declarations

The C++ grammar also allows a solitary semicolon as a declaration or member declaration; however such a declaration does not introduce a name and so contravenes the rule above. The rule can be relaxed in this case using the directive:

#pragma TenDRA extra ; allow

Note that the C++ grammar explicitly allows for an extra semicolon following an inline member function definition, but that semicolons following other function definitions are actually empty declarations of the form above. A solitary semicolon in a statement is interpreted as an empty expression statement rather than an empty declaration statement.

Some dialects of C allow extra semicolons at the external declaration and definition level in contravention of the ISO C standard. For example, the program:

int f ()
{
    return ( 0 );
};

is not ISO compliant. The checker enforces the ISO rules by default, but the errors raised may be reduced to warning or suppressed entirely using:

#pragma TenDRA extra ; permit

with permit as warning or allow. The disallow option restores the default behaviour.

7.4. Unifying the tag name space

Each object in the tag name space is associated with a classification (struct, union or enum) of the type to which it refers. If such a tag is used, it must be preceded by the correct classification, otherwise the checker produces an error by default. However, the pragma:

#pragma TenDRA ignore struct/union/enum tag status

may be used to change the severity of the error. The options for status are: on (allows a tag to be used with any of the three classifications, the correct classification being inferred from the type definition), warning or off.

7.5. Extra commas

The ISO C standard does not allow extra commas in enumeration type declarations e.g.

enum e { red , orange , yellow , };

The extra comma at the end of the declaration is flagged as an error by default, but this behaviour may be changed by using:

#pragma TenDRA extra , permit

where permit has the usual allow, disallow and warning options.

7.6. Implicit int

The C "implicit int" rule, whereby a type of int is inferred in a list of type or declaration specifiers which does not contain a type name, has been removed in ISO C++, although it was supported in older dialects of C++. This check is controlled by the directive:

#pragma TenDRA++ implicit int type allow

Partial relaxations of this rule are allowed. The directive:

#pragma TenDRA++ implicit int type for const/volatile allow

will allow for implicit int when the list of type specifiers contains a cv-qualifier. Similarly the directive:

#pragma TenDRA implicit int type for function return allow

will allow for implicit int in the return type of a function definition (this excludes constructors, destructors and conversion functions, where special rules apply). A function definition is the only kind of declaration in ISO C where a declaration specifier is not required. Older dialects of C allowed declaration specifiers to be omitted in other cases. Support for this behaviour can be enabled using:

#pragma TenDRA implicit int type for external declaration allow

The four cases can be demonstrated in the following example:

extern a ;	// implicit int
const b = 1 ;	// implicit const int

f ()	// implicit function return
{
	return 2 ;
}

c = 3 ;	// error: not allowed in C++

Older C dialects allow external variables to be specified without a type, the type int being inferred. Thus, for example:

a, b;

is equivalent to:

int a, b;

By default these inferred declarations are not permitted, though tdfc2's behaviour can be modified using:

#pragma TenDRA implicit int type for external declaration permit

where permit is allow, warning or disallow.

A more common feature, allowed by the ISO C standard, but considered bad style by some, is the inference of an int return type for functions defined in the form:

f ( int n )
{
    ....
}

The checker's treatment of such functions can be determined using:

#pragma TenDRA implicit int type for function return permit

where permit can be allow, warning or disallow.

7.7. Implicit function declarations

C, but not C++, allows calls to undeclared functions, the function being declared implicitly. It is possible to enable support for implicit function declarations using the directive:

#pragma TenDRA implicit function declaration on

Such implicitly declared functions have C linkage and type int ( ... ).

7.8. Forward enumeration declarations

The ISO C Standard (Section 6.5.2.3) states that the first introduction of an enumeration tag shall declare the constants associated with that tag. This rule is enforced by the checker in default mode, however it can be relaxed using the pragma:

#pragma TenDRA forward enum declaration permit

where replacing permit by allow permits the declaration and use of an enumeration tag before the declaration of its associated enumeration constants. A disallow variant which restores the default behaviour is also available.

7.9. Variable scope in for statements

In ISO C++ the scope of a variable declared in a for-init-statement is the body of the for statement; in older dialects it extended to the end of the enclosing block. So:

for ( int i = 0 ; i < 10 ; i++ ) {
	// for statement body
}
return i ;	// OK in older dialects, error in ISO C++

This behaviour is controlled by the directive:

#pragma TenDRA++ for initialization block on

a state of on corresponding to the ISO rules and off to the older rules. Perhaps most useful is the warning state which implements the old rules but gives a warning if a variable declared in a for-init-statement is used outside the corresponding for statement body. A program which does not give such warnings should compile correctly under either set of rules.
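
One way to write a loop that behaves identically under both sets of rules is to declare the control variable before the for statement whenever its value is needed afterwards. A sketch, with a hypothetical function:

```c
#include <assert.h>

/* Returns the index of the first element >= limit, or n if there is none. */
int first_at_least(const int *a, int n, int limit)
{
    int i;   /* declared outside the for statement, so its scope is clear */

    for (i = 0; i < n; i++) {
        if (a[i] >= limit)
            break;
    }
    return i;   /* valid under both the ISO rules and the older rules */
}
```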

7.10. Anonymous unions

A union declared without introducing a tag or identifier is termed an anonymous union. Members populate the scope where the union itself is declared. For example, this may be a surrounding struct, or a block:

union {
	int a;
	int b;
};
a = 5;

The ISO C Standard (Section 6.5.2.1) states that a union declaration must contain a tag or identifier. Several compilers permit this as an extension to C, and the later C11 standard formalises this as a required feature. Permissibility may be controlled using the pragma:

#pragma TenDRA anonymous union permit

where replacing permit by allow permits the declaration of an anonymous union, and warning will allow the declaration but produce a warning. A disallow variant which restores the default behaviour is also available.

By default anonymous unions are disallowed for C.

Anonymous unions are required to be supported by C++, and setting this pragma has no effect. For C++, an anonymous union cannot have private or protected members or member functions (in addition, no union can have static data members).
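
For C, an anonymous union as a member of a surrounding struct is the case formalised by C11. A sketch using hypothetical names, which needs a C11 compiler (or, for tdfc2, the allow state described above):

```c
#include <assert.h>

struct value {
    int is_float;
    union {          /* anonymous union: no tag and no declarator */
        int i;
        float f;
    };
};

/* The union members are named as if they were members of struct value. */
int value_as_int(const struct value *v)
{
    return v->is_float ? (int)v->f : v->i;
}
```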

8. Configuration for initialisers

  1. 8.1. Initialisation of compound types
  2. 8.2. Variable initialisation

8.1. Initialisation of compound types

Many older C dialects do not allow the initialisation of automatic variables of compound type. Thus, for example:

void f ()
{
	struct {
		int a;
		int b;
	} x = { 3, 2 };
}

would not be allowed by some older compilers, although by default tdfc2 does not raise any errors since the code is legal according to the ISO C standard. The checker's behaviour may be changed using:

#pragma TenDRA initialization of struct/union (auto) permit

where permit is allow, warning or disallow. This feature is particularly useful when developing a program which is intended to be compiled with a compiler which does not support automatic compound initialisations.

8.2. Variable initialisation

The ISO C standard (Section 6.5.7) states that all expressions in an initialiser for an object that has static storage duration or in an initialiser-list for an object that has aggregate or union type shall be constant expressions. The pragma:

#pragma TenDRA variable initialization permit

may be used to allow non-constant initialisers if permit is replaced by allow. The other option for permit is disallow which restores the default behaviour of flagging non-constant initialisers for objects of static storage duration as errors.

9. Configuration for expressions

  1. 9.1. Cast expressions
  2. 9.2. Initialiser expressions
  3. 9.3. Lvalue expressions

9.1. Cast expressions

ISO C++ introduces the constructs static_cast, const_cast and reinterpret_cast, which can be used in various contexts where an old style explicit cast would previously have been used. By default, an explicit cast can perform any combination of the conversions performed by these three constructs. To aid migration to the new style casts the directives:

#pragma TenDRA++ explicit cast as cast-state allow
#pragma TenDRA++ explicit cast allow

where cast-state is defined as follows:

cast-state :
			static_cast
			const_cast
			reinterpret_cast
			static_cast | cast-state
			const_cast | cast-state
			reinterpret_cast | cast-state

can be used to restrict the conversions which can be performed using explicit casts. The first form sets the interpretation of explicit cast to be combinations of the given constructs; the second resets the interpretation to the default. For example:

#pragma TenDRA++ explicit cast as static_cast | const_cast allow

means that conversions requiring reinterpret_cast (the most unportable conversions) will not be allowed to be performed using explicit casts, but will have to be given as a reinterpret_cast construct. Changing allow to warning will also cause a warning to be issued for every explicit cast expression.

9.2. Initialiser expressions

C, but not C++, only allows constant expressions in static initialisers. The directive:

#pragma TenDRA variable initialization allow

can be used to enable support for C++-style dynamic initialisers. Conversely, it can be used in C++ to detect such dynamic initialisers.

In older dialects of C it was not possible to initialise an automatic variable of structure or union type. This can be checked for using the directive:

#pragma TenDRA initialization of struct/union (auto) allow

The directive:

#pragma TenDRA++ complete initialization analysis on

can be used to check aggregate initialisers. The initialiser should be fully bracketed (i.e. with no elision of braces), and should have an entry for each member of the structure or array.
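
The form of initialiser this analysis expects can be contrasted with a legal but elided one. The names below are hypothetical; going by the description above, the second initialiser is the kind the analysis is intended to report, although both give the same value.

```c
#include <assert.h>

struct point { int x, y; };
struct segment { struct point from, to; };

/* Fully bracketed, with an entry for every member. */
const struct segment bracketed = { { 1, 2 }, { 3, 4 } };

/* Legal C, but the inner braces have been elided. */
const struct segment elided = { 1, 2, 3, 4 };

int same_segment(const struct segment *a, const struct segment *b)
{
    return a->from.x == b->from.x && a->from.y == b->from.y &&
           a->to.x == b->to.x && a->to.y == b->to.y;
}
```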

9.3. Lvalue expressions

C++ defines the results of several operations to be lvalues, whereas they are rvalues in C. The directive:

#pragma TenDRA conditional lvalue allow

is used to apply the C++ rules for lvalues in conditional (?:) expressions.

Older dialects of C++ allowed this to be treated as an lvalue. It is possible to enable support for this dialect feature using the directive:

#pragma TenDRA++ this lvalue allow

however it is recommended that programs using this feature should be modified.

The ? operator cannot normally be used to define an lvalue, so that for example, the program:

struct s {
    int a, b;
};

void f ( int n, struct s *s1, struct s *s2 )
{
    ( n ? *s1 : *s2 ).a = 0;
}

is not allowed in ISO C. The pragma:

#pragma TenDRA conditional lvalue allow

allows conditional lvalues if:

  • Both options of the conditional operator have compatible compound types;

  • Both options of the conditional are lvalues.

(there is also a disallow variant, but warning is not permitted in this case).
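
Code that must remain portable can usually avoid conditional lvalues by selecting a pointer first and then assigning through it. A sketch with a hypothetical function name:

```c
#include <assert.h>

struct s {
    int a, b;
};

/* Clears the a member of *s1 if n is non-zero, otherwise of *s2. */
void clear_a(int n, struct s *s1, struct s *s2)
{
    struct s *target = n ? s1 : s2;   /* the result of ?: is used as a value */
    target->a = 0;
}
```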

10. Configuration for functions

  1. 10.1. Ellipsis in function calls
  2. 10.2. Static block level functions

10.1. Ellipsis in function calls

The directive:

#pragma TenDRA ident ... allow

may be used to enable or disable the use of ... as a primary expression in a function defined with ellipsis. The type of such an expression is implementation defined. This expression is used in the definition of the va_start macro in the <stdarg.h> header. This header automatically enables this switch.

An ellipsis is not an identifier and should not be used in a function call, even if, as in the program below, the function prototype contains an ellipsis:

int f (int a, ...)
{
    return 1;
}

int main()
{
    int x, y;
    x = f(y, ...);
    return 1;
}

In default mode the checker raises an error if an ellipsis is used as a parameter in a function call. The severity of this error can be modified by using:

#pragma TenDRA ident ... permit

If permit is replaced by allow the ellipsis is ignored, if warning is used tdfc2 produces a warning and if disallow is used the default behaviour is restored.
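
In portable code the extra arguments of such a function are reached through the macros of <stdarg.h> rather than by writing ... in an expression. A sketch, with a hypothetical function:

```c
#include <assert.h>
#include <stdarg.h>

/* Sums count int arguments passed after the count itself. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);          /* start after the last named parameter */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```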

10.2. Static block level functions

The ISO C standard (Section 6.5.1) states that the declaration of an identifier for a function that has block scope shall have no explicit storage-class specifier other than extern. By default, tdfc2 raises an error for declarations which do not conform to this rule. The behaviour can be modified using:

#pragma TenDRA block function static permit

where permit is allow (accept block scope function declarations with other storage-class specifiers), disallow or warning.
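Code that must pass in the default mode can move the static declaration to file scope, where ISO C permits the storage-class specifier. A minimal sketch, with the hypothetical names scale and scaled_plus_one:

```c
/* File-scope declaration: `static` is permitted here by ISO C. */
static int scale(int x);

int scaled_plus_one(int x)
{
    /* Writing `static int scale(int);` inside this block instead
       would be rejected in default mode. */
    return scale(x) + 1;
}

static int scale(int x)
{
    return 2 * x;
}
```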

11. Configuration for linkage

  1. 11.1. Default linkage
  2. 11.2. Identifier linkage
  3. 11.3. Static identifiers
  4. 11.4. External volatility
  5. 11.5. Function linkage
  6. 11.6. Resolving linkage problems

11.1. Default linkage

It is possible to set the default language linkage using the directive:

#pragma TenDRA++ external linkage string-literal

This is equivalent to enclosing the rest of the current checking scope in:

extern string-literal {
	....
}

It is unspecified what happens if such a directive is used within an explicit linkage specification and does not nest correctly. This directive is particularly useful when used in a named environment associated with an include directory. For example, it can be used to express the fact that all the objects declared in headers included from that directory have C linkage.
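Without such a directive, headers that need C linkage conventionally wrap their declarations in an explicit linkage specification, guarded by __cplusplus so that the same text remains valid C. The sketch below shows that source-level form, which the directive makes unnecessary for a whole include directory; add_c is a hypothetical function:

```c
/* In a shared header: give the declarations C linkage under C++. */
#ifdef __cplusplus
extern "C" {
#endif

int add_c(int a, int b);

#ifdef __cplusplus
}
#endif

/* In a source file: the matching definition. */
int add_c(int a, int b)
{
    return a + b;
}
```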

11.2. Identifier linkage

The ISO C standard, section 6.1.2.2, states that if, within a translation unit, an identifier appears with both internal and external linkage, the behaviour is undefined. By default, the checker silently declares the variable with external linkage. The check to detect variables which are redeclared with incompatible linkage is controlled using:

#pragma TenDRA incompatible linkage permit

where permit may be allow (default mode), warning (warn about incompatible linkage) or disallow (raise errors for redeclarations with incompatible linkage).

If an object is declared with both external and internal linkage in the same translation unit then, by default, an error is given. This behaviour can be changed using the directive:

#pragma TenDRA incompatible linkage allow

When incompatible linkages are allowed, whether the resultant identifier has external or internal linkage can be set using one of the directives:

#pragma TenDRA linkage resolution : off
#pragma TenDRA linkage resolution : (external) on
#pragma TenDRA linkage resolution : (internal) on

It is possible to declare objects with external linkage in a block. C leaves it undefined whether declarations of the same object in different blocks, such as:

void f ()
{
	extern int a ;
	....
}

void g ()
{
	extern double a ;
	....
}

are checked for compatibility. In C++, however, the one definition rule implies that such declarations are indeed checked for compatibility. The status of this check can be set using the directive:

#pragma TenDRA unify external linkage on

11.3. Static identifiers

By default, objects and functions with internal linkage are mapped to tags without external names in the output TDF capsule. Such names are therefore not available to the installer, which has to make up internal names to represent these objects in its output. This is undesirable for operations such as profiling, where a meaningful internal name is needed to make sense of the output. The directive:

#pragma TenDRA preserve identifier-list

can be used to preserve the names of the given list of identifiers with internal linkage. This is done using the static_name_def TDF construct. The form:

#pragma TenDRA preserve *

will preserve the names of all identifiers with internal linkage in this way.

11.4. External volatility

Older dialects of C treated all identifiers with external linkage as if they had been declared volatile (i.e. by being conservative in optimising such values). This behaviour can be enabled using the directive:

#pragma TenDRA external volatile_t

which instructs the checker thereafter to treat any object declared with external linkage (ISO C standard Section 6.1.2.2) as if it were volatile (ISO C standard Section 6.5.3). In the default mode, objects with external linkage are only treated as volatile if they were declared with the volatile type qualifier.
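Code that depends on the traditional behaviour can be made independent of the directive by adding the volatile qualifier to the external objects concerned. A minimal sketch, in which tick_count, on_tick and ticks_seen are hypothetical names:

```c
/* Explicitly qualified: accesses are not optimised away,
   whether or not the directive is in force. */
volatile int tick_count = 0;

void on_tick(void)
{
    tick_count = tick_count + 1;
}

int ticks_seen(void)
{
    return tick_count;
}
```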

11.5. Function linkage

A change in ISO C++ relative to older dialects is that the language linkage of a function now forms part of the function type. For example:

extern "C" int f ( int ) ;
int ( *pf ) ( int ) = f ;		// error

The directive:

#pragma TenDRA++ external function linkage on

can be used to control whether function types with differing language linkages, but which are otherwise compatible, are considered compatible or not.

Note that it is not possible in ISO C or C++ to declare objects or functions with internal linkage in a block. While static object definitions in a block have a specific meaning, there is no real reason why static functions should not be declared in a block. This behaviour can be enabled using the directive:

#pragma TenDRA block function static allow

Inline functions have external linkage by default in ISO C++, but internal linkage in older dialects. The default linkage can be set using the directive:

#pragma TenDRA++ inline linkage linkage-spec

where linkage-spec can be external or internal. Similarly const objects have internal linkage by default in C++, but external linkage in C. The default linkage can be set using the directive:

#pragma TenDRA++ const linkage linkage-spec

11.6. Resolving linkage problems

Often the way that identifier names are resolved can alter the semantics of a program. For example, in:

void f () {
	{
		extern void g ();
		g ( 3 );
	}
	g ( 7 );
}

the external declaration of g is only in scope in the inner block of f. Thus, at the second call of g, it is not in scope, and so is inferred to have declaration:

extern int g ();

(see 7.7, Implicit function declarations). This conflicts with the previous declaration of g which, although not in scope, has been registered in the external namespace. The pragma:

#pragma TenDRA unify external linkage on

modifies the algorithm for resolving external linkage by searching the external namespace before inferring a declaration. In the example above, this results in the second use of g being resolved to the previous external declaration. The on can be replaced by warning to give a warning when such resolutions are detected, or off to switch this feature off.
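The conflict can also be removed at source level by giving g a single file-scope prototype, so that a declaration is in scope at every call and nothing has to be inferred. A minimal sketch, assuming g takes one int argument (the original declares it without a prototype); last_value is a hypothetical variable added only to make the effect observable:

```c
static int last_value = 0;

/* One file-scope prototype, visible at both calls below. */
void g(int value);

void f(void)
{
    {
        g(3);
    }
    g(7);   /* resolved against the prototype, not an inferred declaration */
}

void g(int value)
{
    last_value = value;
}
```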

Another linkage problem, which is left undefined in the ISO C standard, is illustrated by the following program:

int f ()
{
	extern int g ();
	return ( g () );
}

static int g ()
{
	return ( 0 );
}

Is the external function g (the declaration of which would be inferred if it were omitted) the same as the static function g? Had the order of the two functions been reversed, there would be no doubt that they were the same; in the given case, however, the behaviour is undefined. By default, the linkage is resolved externally, so that the two uses of g are not treated as the same function. However, the checker can be made to resolve the linkage internally, so that the two uses of g are identified with one another. The resolution algorithm can be set using:

#pragma TenDRA linkage resolution : action

where action can be one of:

  • (internal) on

  • (internal) warning

  • (external) on

  • (external) warning

  • off

depending on whether the linkage resolution is internal, external, or default, and whether a warning message is required. The most useful behaviour is to issue a warning for all such occurrences (by setting action to (internal) warning, for example) so that the programmer can be alerted to clarify what was intended.
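Once a warning has been raised, the ambiguity can be removed by making the static declaration visible before the block-scope extern declaration; ISO C then gives the later declaration the linkage of the earlier one. A minimal sketch of that ordering, reusing the names from the example above:

```c
/* Declared first, so the extern declaration below inherits
   internal linkage from this one. */
static int g(void);

int f(void)
{
    extern int g(void);   /* refers to the static g declared above */
    return g();
}

static int g(void)
{
    return 0;
}
```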

A. Standard library

At present the default implementation contains only a very small fraction of the ISO C++ library, namely those headers - <exception>, <new> and <typeinfo> - which are an integral part of the language specification. These headers are also those which require the most cooperation between the producer and the library implementation, as described in C/C++ Producer Implementation.

It is suggested that if further library components are required then they be acquired from third parties. It should be noted, however, that such libraries may require some effort to be ported to an ISO-compliant compiler; for example, some information on porting the libio component of libg++, which contains some very compiler-dependent code, is given in the C++ and Portability document. Libraries compiled with other C++ compilers may not link correctly with modules compiled using tcc.