Name

tdfc2charset — tdfc2 execution character set syntax

Lexical conventions

Input is line delimited, and a directive must be completed within a single line. Whitespace within a line is not syntactically significant. Directives are terminated by the statement separator ;.

Comments begin with # and run to the end of the line.

The following lexical conventions are used within this reference:

Symbol      Meaning
c           A single character (including single quote, “'”)
\...        An escape sequence (see below).
v           A value in the execution character set, in base 10.
n           A numerical digit in an appropriate base (case insensitive)
0..9        A range of ASCII values in base 10, in ascending and
9..0        descending order, respectively.
'A'..'Z'    A range of characters in ascending and descending order,
'Z'..'A'    respectively.

Character ranges may be either digits ('0'..'9'), uppercase ('A'..'Z') or lowercase ('a'..'z'), but not a mix. Any endpoints within a range may be specified. For example, the digit range '0'..'9' could be narrowed to '3'..'7' specifically. All ranges are inclusive.

The following character escapes are permitted:

Escape      Meaning
\\          Backslash
\'          Single quote
\t          Horizontal tab
\v          Vertical tab
\f          Form feed
\a          Alert
\b          Backspace
\r          Carriage return
\n          New line
\0nn        An octal ASCII value of zero or more digits
\xnn        A hexadecimal ASCII value of one or more digits
\nnn        A decimal ASCII value of one or more digits

Directives

Directives of the form x -> y are concerned with mapping ASCII characters (on the left) to corresponding values in the execution character set (on the right). Note these replace any previous mappings.

'c'    -> v;
'\c'   -> v;
'\0nn' -> v;
'\xnn' -> v;
'\nnn' -> v;

Map a single character literal or ASCII value to an arbitrary execution set value.

'A'..'Z' -> 0..9;
'Z'..'A' -> 0..9;

Map a range of characters to a range of execution set values. The two ranges must contain the same number of elements.

Note the ASCII character ranges include digits and lowercase.
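As a rough illustration of range mapping, the following Python sketch models the mapping as a dictionary from ASCII code to execution set value. The names expand and map_range are hypothetical. The sketch assumes that elements are paired in the order written, so a descending range on the left pairs its first (highest) character with the first value on the right. It does not check that both endpoints belong to the same class of character.

```python
def expand(first, last):
    """Expand an inclusive range, descending when first > last."""
    step = 1 if first <= last else -1
    return list(range(first, last + step, step))

def map_range(table, src, dst):
    """Map the characters src[0]..src[1] onto the values dst[0]..dst[1].

    table is a dict from ASCII code to execution set value (an assumed model).
    """
    sources = expand(ord(src[0]), ord(src[1]))
    targets = expand(dst[0], dst[1])
    if len(sources) != len(targets):
        raise ValueError("ranges differ in length")
    for s, t in zip(sources, targets):
        table[s] = t  # replaces any previous mapping for this character

table = {}
map_range(table, ("Z", "A"), (0, 25))   # like 'Z'..'A' -> 0..25;
map_range(table, ("0", "9"), (48, 57))  # like '0'..'9' -> 48..57;
```

After these two calls, 'Z' maps to 0, 'A' maps to 25, and the digits keep their ASCII values.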

Rearranging mapped ranges:

permute 0..9;

Randomly permute (shuffle) the values within the given range.

reverse 0..9;

Reverse the order of the values within the given range.

rotate 0..9 by n;

Rotate the values within the given range by n elements. Values are rotated as in a circular queue, and values pushed off one side wrap around to the opposing side. n may be positive (to rotate to the right) or negative (to the left).
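The circular-queue behaviour can be sketched in Python on a plain list standing in for the elements of the range. The function name rotate is illustrative only, and how tdfc2 represents the range internally is not specified here.

```python
def rotate(values, n):
    """Rotate a list right by n places (left when n is negative), wrapping around."""
    if not values:
        return []
    k = n % len(values)  # Python's % always yields a non-negative offset here
    if k == 0:
        return list(values)
    return values[-k:] + values[:-k]

print(rotate([0, 1, 2, 3, 4], 2))   # [3, 4, 0, 1, 2]
print(rotate([0, 1, 2, 3, 4], -1))  # [1, 2, 3, 4, 0]
```

Rotating by the length of the range, or by any multiple of it, leaves the order unchanged.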

slide 0..9 in 0..9;

Rotate the containing range by a random number of elements, such that the inner range is moved at most to either end, but will never wrap over the edge.
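One possible reading of slide, sketched in Python: the shift is drawn at random from the interval that keeps the inner block inside the containing range, and the containing range is then rotated by that amount. The function name, its parameters and the list model are assumptions made for illustration, not a description of the tdfc2 implementation.

```python
import random

def slide(values, inner_start, inner_len, rng=random):
    """Rotate values by a random shift that never wraps the inner block.

    values      -- elements of the containing range
    inner_start -- index of the first inner element within values
    inner_len   -- number of inner elements
    Returns (rotated_values, shift); positive shift means to the right.
    """
    inner_end = inner_start + inner_len
    if inner_start < 0 or inner_len < 1 or inner_end > len(values):
        raise ValueError("inner range must lie within the containing range")
    max_left = inner_start                # room before the inner block
    max_right = len(values) - inner_end   # room after the inner block
    shift = rng.randint(-max_left, max_right)  # both bounds inclusive
    k = shift % len(values)
    rotated = values[-k:] + values[:-k] if k else list(values)
    return rotated, shift

rotated, shift = slide(list(range(10)), 3, 3)
# The inner block [3, 4, 5] is still contiguous, now starting at index 3 + shift.
```

At the extremes the inner block ends up flush against the left or right edge of the containing range.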

fill 0..9 in 0..9;

Populate the inner range with arbitrary values not present in the containing range.

Conformance

Each directive is interpreted in order, and operates on the current mapping in turn. It is permitted to have (for example) duplicate values and other non-legal constructs whilst building the mapping, but the final result is expected to conform to the requirements for the execution character set as outlined in section 2.2.1 of the ANSI C89 standard.

Briefly, these requirements are that '\0' must be mapped to value 0, that the numeric characters '0' to '9' are mapped to contiguous values, and that various characters (specified by the C standard) are required to be present. No duplicate values are permitted.
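The requirements summarised above could be checked along the following lines. This Python sketch again models the final mapping as a dictionary from ASCII code to execution set value. check_charset is a hypothetical name, and the set of required characters is left to the caller rather than reproduced from the standard. The digit check assumes ascending order, each digit one greater than the previous.

```python
def check_charset(table, required=""):
    """Return a list of problems found in a finished mapping (empty if none).

    table    -- dict from ASCII code to execution set value (assumed model)
    required -- characters that must be present, supplied by the caller
    """
    problems = []
    if table.get(0) != 0:
        problems.append("'\\0' must map to 0")
    digits = [table.get(ord("0") + i) for i in range(10)]
    if None in digits or any(b - a != 1 for a, b in zip(digits, digits[1:])):
        problems.append("'0'..'9' must map to contiguous ascending values")
    values = list(table.values())
    if len(values) != len(set(values)):
        problems.append("duplicate execution set values")
    missing = [c for c in required if ord(c) not in table]
    if missing:
        problems.append("missing required characters: %r" % "".join(missing))
    return problems

identity = {code: code for code in range(128)}
print(check_charset(identity, "AZaz09"))  # []
```

Because intermediate states may be invalid, a check of this kind only makes sense once every directive has been applied.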

Caveats

Only execution character set values in the range 0..255 inclusive are currently supported. Likewise, source character set values are only supported in the same range, even though the compiler uses Unicode for its source character set.

No attempt is made to cater for locale or for wide characters, although both are supported by the compiler itself.