3. Configuration for lexical analysis

  3.1. Lexical analysis
  3.2. Keywords
  3.3. Nested comments
  3.4. Identifier names
  3.5. Identifier name length

3.1. Lexical analysis

During lexical analysis, a source file which is not empty should end in a newline character. It is possible to relax this constraint using the directive:

#pragma TenDRA no nline after file end allow
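
As an illustration of the condition being checked, the following minimal sketch tests whether a buffer holding the contents of a source file satisfies the rule. The function name ends_in_newline and the buffer-based interface are assumptions made for this example; they are not part of TenDRA.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of TenDRA: returns 1 if the file contents
   satisfy the rule (the file is empty, or its last character is a
   newline), and 0 otherwise. */
int ends_in_newline(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len == 0) {
        return 1; /* an empty file is exempt from the rule */
    }
    return buf[len - 1] == '\n';
}
```

A file for which this function returns 0 is the case that the directive above makes acceptable.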

3.2. Keywords

Several places in this section describe how to introduce keywords for TenDRA language extensions. By default, no such extra keywords are defined. There are also low-level directives for defining and undefining keywords. The directive:

#pragma TenDRA++ keyword identifier for keyword identifier

can be used to introduce a keyword (the first identifier) standing for the standard C++ keyword given by the second identifier. The directive:

#pragma TenDRA++ keyword identifier for operator operator

can similarly be used to introduce a keyword giving an alternative representation for the given operator or punctuator, as, for example, in:

#pragma TenDRA++ keyword and for operator &&

Finally, the directive:

#pragma TenDRA++ undef keyword identifier

can be used to undefine a keyword.
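
The effect of these three directives can be pictured as updates to a table that maps each extra keyword to the token it stands for. The sketch below is a toy model only, not a description of how TenDRA stores keywords: the names define_keyword, undef_keyword and lookup_keyword, the fixed table size, and the choice to let a redefinition overwrite an existing entry are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_KEYWORDS 16 /* arbitrary size chosen for this sketch */

/* Hypothetical table entry: an extra keyword and the token it stands for. */
struct keyword {
    const char *name;   /* spelling of the new keyword, e.g. "and" */
    const char *target; /* standard keyword or operator, e.g. "&&" */
};

static struct keyword keywords[MAX_KEYWORDS];
static size_t keyword_count = 0;

/* Returns the token that name stands for, or NULL if it is not a keyword. */
const char *lookup_keyword(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < keyword_count; i++) {
        if (strcmp(keywords[i].name, name) == 0) {
            return keywords[i].target;
        }
    }
    return NULL;
}

/* Models "keyword name for keyword/operator target".
   Returns 0 on success and -1 if the table is full. */
int define_keyword(const char *name, const char *target)
{
    size_t i;
    assert(name != NULL && target != NULL);
    for (i = 0; i < keyword_count; i++) {
        if (strcmp(keywords[i].name, name) == 0) {
            keywords[i].target = target; /* assumed: redefinition overwrites */
            return 0;
        }
    }
    if (keyword_count == MAX_KEYWORDS) {
        return -1;
    }
    keywords[keyword_count].name = name;
    keywords[keyword_count].target = target;
    keyword_count++;
    return 0;
}

/* Models "undef keyword name". Removing an unknown name does nothing. */
void undef_keyword(const char *name)
{
    size_t i;
    assert(name != NULL);
    for (i = 0; i < keyword_count; i++) {
        if (strcmp(keywords[i].name, name) == 0) {
            keywords[i] = keywords[keyword_count - 1]; /* fill the gap */
            keyword_count--;
            return;
        }
    }
}
```

In this model, the example directive for and corresponds to define_keyword("and", "&&"), after which lookup_keyword("and") yields "&&" until undef_keyword("and") is called.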

3.3. Nested comments

C-style comments do not nest. The directive:

#pragma TenDRA nested comment analysis on

enables a check for the characters /* within C-style comments.

The occurrence of the /* characters inside a C comment, i.e. text surrounded by the /* and */ symbols, is usually a mistake and can lead to the unexpected termination of a comment. By default such nested comments are processed silently; however, an error or warning can be produced by setting:

#pragma TenDRA nested comment analysis status

where status is on or warning. If status is off, the default behaviour is restored.
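
The kind of check described here can be sketched as a small scanner that counts comment openers found inside an already open comment. This is an illustrative sketch rather than TenDRA's implementation: the function name count_nested_comment_openers is hypothetical, and for brevity the scanner ignores string literals, character literals and C++-style comments, which a real lexer would have to handle.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical checker, not TenDRA's implementation: counts how many
   times the comment-opening character pair occurs inside a C-style
   comment in the NUL-terminated text src. */
size_t count_nested_comment_openers(const char *src)
{
    size_t count = 0;
    size_t i = 0;
    int in_comment = 0;

    assert(src != NULL);
    while (src[i] != '\0') {
        if (!in_comment) {
            if (src[i] == '/' && src[i + 1] == '*') {
                in_comment = 1; /* start of a comment */
                i += 2;         /* the star cannot also begin a closer */
            } else {
                i++;
            }
        } else if (src[i] == '*' && src[i + 1] == '/') {
            in_comment = 0;     /* the first closer ends the comment */
            i += 2;
        } else {
            if (src[i] == '/' && src[i + 1] == '*') {
                count++;        /* an opener inside an open comment */
            }
            i++;                /* advance by one so a closer that
                                   shares the star is still seen */
        }
    }
    return count;
}
```

A non-zero result corresponds to the situation that would be reported when the check is enabled. Note that the comment still ends at the first closing pair, which is why code following an apparently nested comment can be left uncommented by mistake.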

3.4. Identifier names

During lexical analysis, each character in the source file has an associated look-up value which is used to determine whether the character can be used in an identifier name, is a white space character, and so on. These values are stored in a simple look-up table. It is possible to set the look-up value using:

#pragma TenDRA++ character character-literal as character-literal allow

which sets the look-up for the first character to be the default look-up for the second character. The form:

#pragma TenDRA++ character character-literal disallow

sets the look-up of the character to be that of an invalid character. The forms:

#pragma TenDRA++ character string-literal as character-literal allow
#pragma TenDRA++ character string-literal disallow

can be used to modify the look-up values for the set of characters given by the string literal. For example:

#pragma TenDRA character '$' as 'a' allow
#pragma TenDRA character '\r' as ' ' allow

allows $ to be used in identifier names (like a) and carriage return to be a white space character. The former is a common dialect feature and can also be controlled by the directive:

#pragma TenDRA dollar as ident allow

The ISO C standard (Section 6.1) states that the use of the character $ in identifier names is illegal, so by default such identifiers are flagged as errors; the directive above can be used to allow them. There is also a disallow variant which restores the default behaviour.
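
The look-up table described in this section can be sketched as an array indexed by character value. The sketch below is a simplified model under stated assumptions, not TenDRA's actual table: the character classes, the default classification (in which carriage return and $ are not identifier or white space characters), and the function names char_init, char_allow_as, chars_allow_as, char_disallow and char_lookup are all hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical character classes; TenDRA's real categories may differ. */
enum char_class { CC_INVALID, CC_SPACE, CC_IDENT, CC_PUNCT };

static enum char_class lookup_table[UCHAR_MAX + 1];

/* Assumed default classification, used only for this sketch. */
static enum char_class default_class(unsigned char c)
{
    if (isalnum(c) || c == '_') {
        return CC_IDENT;
    }
    if (c == ' ' || c == '\t' || c == '\n' || c == '\v' || c == '\f') {
        return CC_SPACE;
    }
    if (ispunct(c)) {
        return CC_PUNCT;
    }
    return CC_INVALID;
}

/* Resets every entry to its default look-up value. */
void char_init(void)
{
    int c;
    for (c = 0; c <= UCHAR_MAX; c++) {
        lookup_table[c] = default_class((unsigned char)c);
    }
}

/* Models "character c as like allow": c takes the DEFAULT look-up
   value of like, whatever like currently maps to. */
void char_allow_as(unsigned char c, unsigned char like)
{
    lookup_table[c] = default_class(like);
}

/* Models the string-literal form: applies the change to each character. */
void chars_allow_as(const char *set, unsigned char like)
{
    assert(set != NULL);
    for (; *set != '\0'; set++) {
        char_allow_as((unsigned char)*set, like);
    }
}

/* Models "character c disallow": c becomes an invalid character. */
void char_disallow(unsigned char c)
{
    lookup_table[c] = CC_INVALID;
}

enum char_class char_lookup(unsigned char c)
{
    return lookup_table[c];
}
```

In this model, the two example directives correspond to char_allow_as('$', 'a') and char_allow_as('\r', ' '), after which $ is classified as an identifier character and carriage return as white space.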

3.5. Identifier name length

Under the ISO C standard rules on identifier name length, an implementation is only required to treat the first 31 characters of an internal name and the first 6 characters of an external name as significant. The TenDRA C checker provides a facility for users to specify the maximum number of characters allowed in an identifier name, to prevent unexpected results when the application is moved to a new implementation.

The maximum number of characters allowed in an identifier name can be set using the directives:

#pragma TenDRA set name limit integer-literal
#pragma TenDRA++ set name limit integer-literal warning

This length is given by the name_limit implementation quantity mentioned above. Identifiers which exceed this length raise an error or a warning, but are not truncated.

There is currently no distinction made between external and internal names for length checking. Identifier name lengths are not checked in the default mode.
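
The length check can be sketched as a simple comparison against the configured limit. This is a minimal illustration, not TenDRA's implementation: the function name exceeds_name_limit is hypothetical, and using a limit of 0 to represent the default mode, in which lengths are not checked, is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical check: returns 1 if name is longer than name_limit
   characters, and 0 otherwise. A name_limit of 0 is used here to mean
   that no limit has been set. The name itself is never truncated. */
int exceeds_name_limit(const char *name, size_t name_limit)
{
    assert(name != NULL);
    if (name_limit == 0) {
        return 0; /* default mode: lengths are not checked */
    }
    return strlen(name) > name_limit;
}
```

Because no distinction is made between internal and external names, the same limit applies to both: with a limit of 31 an 11-character name is accepted, while with a limit of 6 it is reported.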