A. Appendix

  A.1. Undefined Behaviour
  A.2. Obscure Features
    A.2.1. Ranges Within Tokens
    A.2.2. Whitespace Within Tokens and Keywords

A.1. Undefined Behaviour

Undefined behaviours are actions that generate output which may be invalid, nonsensical, undesired or simply legal but obscure.

The following constructs are syntactically legal input to Lexi, but their effects are undefined. They may be disallowed entirely in future versions and should be avoided. This release of Lexi permits them only to offer a transition period.

Mapping to the start of another mapping

Defining a mapping which produces a character used at the start of any mapping, including itself, is undefined. For example:

MAPPING "???" -> "?" ;

Calling one function from multiple different-length tokens

Calling the same function from tokens of different lengths is undefined. For example:

TOKEN "//" -> get_comment () ;
TOKEN "#"  -> get_comment () ;

One workaround is to call two macros with different prototypes, both of which call the same function.
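
A minimal sketch of this workaround in C follows. It assumes that the generated lexer passes each character of the matched token as an argument to the called function, which would explain why tokens of different lengths cannot share one function directly. That assumption, and the names get_comment1, get_comment2, skip_comment and LEX_COMMENT, are illustrative and not taken from Lexi itself.

```c
/*
 * Hypothetical lexer specification using one macro per token length:
 *
 *   TOKEN "//" -> get_comment2 () ;
 *   TOKEN "#"  -> get_comment1 () ;
 */

enum { LEX_COMMENT = 1 };  /* hypothetical token code */

/*
 * Shared implementation. c1 is '\0' when the comment opener is a
 * single character.
 */
int
skip_comment(int c0, int c1)
{
    (void) c0;
    (void) c1;

    /* A real lexer would consume input up to the end of the line here. */
    return LEX_COMMENT;
}

/* Two macros with different prototypes, both calling skip_comment. */
#define get_comment2(c0, c1) skip_comment((c0), (c1))
#define get_comment1(c0)     skip_comment((c0), '\0')
```

Each token then refers to a macro whose parameter count matches its own length, while the logic stays in one place.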

A.2. Obscure Features

This section describes legal constructs of doubtful interest.

A.2.1. Ranges Within Tokens

The grammar for Lexi permits strings formed of pre-defined character ranges to be used for token definitions:

TOKEN {A-Z} + "something" -> $x ;

Character ranges are intended to be used for sets (that is, in GROUP), not sequences. Since tokens are defined using sequences, this really means:

TOKEN "ABCDEFGHIJKLMNOPQRSTUVWXYZsomething" -> $x ;

This is probably not the intended effect. Using a group is suggested instead:

GROUP alpha = {A-Z} ;
TOKEN "[alpha]something" -> $x ;
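
The difference between the two forms can be sketched in C as a pair of hand-written matchers. This only models the behaviour described above and is not the code Lexi generates; the function names are hypothetical, and the group test assumes an ASCII character set.

```c
#include <string.h>

/* Sequence form: the range {A-Z} expanded and joined to "something". */
static const char sequence_token[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZsomething";

/* Returns 1 if s begins with the whole 35-character sequence. */
int
matches_sequence_form(const char *s)
{
    return strncmp(s, sequence_token, strlen(sequence_token)) == 0;
}

/* Returns 1 if s begins with one upper-case letter, then "something". */
int
matches_group_form(const char *s)
{
    /* Assumes ASCII, where 'A'..'Z' are contiguous. */
    return s[0] >= 'A' && s[0] <= 'Z'
        && strncmp(s + 1, "something", strlen("something")) == 0;
}
```

An input such as "Qsomething" is accepted by the group form but not by the sequence form, which requires all twenty-six letters in order.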

A.2.2. Whitespace Within Tokens and Keywords

The white group may be used in tokens and keywords just as any other group may:

TOKEN "a[white]" -> $something ;
TOKEN "[white]ab" -> $neverscanned ;

Both of the above are legal token definitions. The second will never be scanned, since white characters are discarded before tokens are matched.
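
The following C sketch models why the second token is unreachable. It is an illustration only: the function names are hypothetical, and the white group is assumed here to contain space, tab and newline.

```c
/* Assumed contents of the white group for this sketch. */
int
is_white(int c)
{
    return c == ' ' || c == '\t' || c == '\n';
}

/* Models the lexer discarding white characters before matching a token. */
const char *
skip_white(const char *s)
{
    while (is_white((unsigned char) *s)) {
        s++;
    }
    return s;
}

/* Matcher for the token "a[white]". */
int
matches_a_white(const char *s)
{
    return s[0] == 'a' && is_white((unsigned char) s[1]);
}

/* Matcher for the token "[white]ab". */
int
matches_white_ab(const char *s)
{
    return is_white((unsigned char) s[0]) && s[1] == 'a' && s[2] == 'b';
}
```

Because matching always starts from the result of skip_white, the first character seen is never white, so matches_white_ab cannot succeed there. White characters inside a token, as in "a[white]", are still seen because skipping only happens before a token begins.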