sid(1) – TenDRA

Name

sid — Syntax Improving Device; parser generator

Synopsis

sid

sid { -h | -? | --help | -v | --version }

Description

The sid command is used to turn descriptions of a language into a program for recognising that language. This manual page details the command line syntax; for more information, consult the sid user documentation.

The number of files specified on the command line varies depending upon the output language. The description of the --language option specifies the number of files for each language.

Switches

sid accepts both short form and long form command line switches. The long form equivalents are due to be removed in the next release.

Short form switches are single characters, and begin with a - or + character. They can be concatenated into a single command line word, e.g.:

-vdl  dump-file  lang

which contains three different switches (-v, which takes no arguments; -d, which takes one argument: dump-file; and -l, which takes one argument: lang).

Long form switches are strings, and begin with -- or ++. With long form switches, only the shortest unique prefix need be entered. The long form of the above example would be:

--version  --dump-file  dump-file  --language  lang
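The "shortest unique prefix" rule can be sketched as follows. This is only an illustrative model in Python, not sid's own implementation; the function name resolve_long_switch is hypothetical, and the switch names are those listed on this page.

```python
# Long form switch names, taken from the option list on this manual page.
LONG_SWITCHES = [
    "dump-file", "factor-limit", "help", "inline",
    "language", "switch", "tab-width", "version",
]


def resolve_long_switch(word):
    """Return the full switch name for a (possibly abbreviated) long form switch.

    Illustrative only: models unique-prefix matching as described above.
    """
    for prefix in ("--", "++"):
        if word.startswith(prefix):
            name = word[len(prefix):]
            break
    else:
        raise ValueError(f"not a long form switch: {word!r}")

    # An exact name always wins; otherwise the prefix must match one name only.
    if name in LONG_SWITCHES:
        return name
    matches = [s for s in LONG_SWITCHES if s.startswith(name)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous switch: {word!r}")
    return matches[0]
```

Under this model, --lang and --l both resolve to language, while a bare -- matches every name and is rejected as ambiguous (sid itself treats -- as the end of option parsing).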

In most cases the argument to a switch follows the switch as a separate word. When several short form switches are concatenated into a single word, their arguments follow that word in the same order as the switches (as in the first example). For some options, the argument may be part of the same word as the switch (such options are shown without a space between the switch and the argument in the switch summaries below). Among short form switches, such a switch terminates any concatenation of switches: either a character follows it, which is treated as its argument, or it is the end of the word, and its argument follows as normal.
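The ordering rule for concatenated short form switches can be illustrated with a small Python sketch. It is an assumption-laden model, not sid's parser: it only handles arguments given as separate words, the helper name parse_short_words is hypothetical, and grammar.sid is a made-up file name.

```python
# Number of arguments each short form switch takes, per the options on this page.
ARG_COUNT = {"v": 0, "h": 0, "?": 0, "d": 1, "f": 1, "i": 1, "l": 1, "s": 1, "t": 1}


def parse_short_words(words):
    """Pair concatenated short form switches with their arguments.

    Returns (parsed, remaining), where parsed is a list of (switch, argument)
    tuples and remaining holds the words left after the switches.
    Illustrative only: stops at the first word that is not a short form switch.
    """
    parsed = []
    rest = list(words)
    while rest and len(rest[0]) > 1 and rest[0][0] in "-+" and rest[0][1] not in "-+":
        word = rest.pop(0)
        for flag in word[1:]:
            if flag not in ARG_COUNT:
                raise ValueError(f"unknown switch: {flag!r}")
            if ARG_COUNT[flag]:
                if not rest:
                    raise ValueError(f"switch {flag!r} needs an argument")
                # Arguments are consumed in the same order as the switches.
                parsed.append((flag, rest.pop(0)))
            else:
                parsed.append((flag, None))
    return parsed, rest


parsed, remaining = parse_short_words(["-vdl", "dump-file", "lang", "grammar.sid"])
```

Here parsed pairs v with no argument, d with dump-file and l with lang, and remaining holds the single word grammar.sid.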

For binary switches, the - or -- switch prefixes set (enable) the switch, and the + or ++ switch prefixes reset (disable) the switch. This is probably back to front, but is in keeping with other programs. The switches -- or ++ by themselves terminate option parsing.

Options

sid accepts the following command line options:

--dump-file file

-d file

This option causes intermediate dumps of the grammar to be written to the file file. The format of the dump files is similar to the format of the grammar specification, with the following exceptions:

Predicates are written with the predicate result replaced by the predicate identifier (this will always be zero), and the result is followed by a ? to indicate that it was a predicate. As an example, the predicate:
```
( b, ? ) = <pred> ( a )
```
would be printed out as:
```
( b : Type1T, 0 : Type2T ) ? = <pred> ( a : Type3T )
```
Items that are considered to be inlinable are prefixed by a +. Items that are tail calls which will be eliminated are prefixed by a *.
Nested rules are written at the outer level, with names of the form outer-rule::....::inner-rule.
Types are provided on call parameter and result tuples.
Inline rules are given a generated name, and are written out as a call to the generated rule (and a definition elsewhere).

--factor-limit limit

-f limit

This option limits the number of rules that can be created during the factorisation process. It is probably best not to change this.

-?

-h

--help

This option writes a summary of the command line options to the standard error stream.

--inline inlines

-i inlines

This option controls what inlining will be done in the output parser. The inlines argument should be a comma-separated list of the following words:

SINGLES: This causes single alternative rules to be inlined. This inlining is no longer performed as a modification to the grammar (it was in version 1.0).
BASICS: This causes rules that contain only basics (and no exception handlers or empty alternatives) to be inlined. The restriction on exception handlers and empty alternatives is rather arbitrary, and may be changed later.
TAIL: This causes tail recursive calls to be inlined. Without this, tail recursion elimination will not be performed.
OTHER: This causes other calls to be inlined wherever possible. Unless the MULTI inlining is also specified, this will be done only for productions that are called once.
MULTI: This causes calls to be inlined, even if the rule being called is called more than once. Turning this inlining on implies OTHER. Similarly, turning off OTHER inlining will turn off MULTI inlining. For grammars of any size, this is probably best avoided; if used, the generated parser may be huge (e.g. a C grammar has produced a file that was several hundred MB in size).
ALL: This turns on all inlining.

In addition, prefixing a word with no turns off that inlining phase. The words may be given in any case. They are evaluated in the order given, so:

--inline noall,singles

would turn on single alternative rule inlining only, whilst:

--inline singles,noall

would turn off all inlining. The default is as if sid were invoked with the option:

--inline noall,basics,tail
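The left-to-right evaluation of the inlines argument can be modelled with the following Python sketch. It is illustrative only and is not taken from sid's source; the function name evaluate_inlines and the start parameter are hypothetical, and the sketch starts from an empty set because this page does not say how a user-supplied list combines with the default.

```python
PHASES = ("singles", "basics", "tail", "other", "multi")


def evaluate_inlines(spec, start=()):
    """Apply a comma-separated inlines argument word by word, in order.

    Returns the set of inlining phases left enabled.
    Illustrative model of the rules described above.
    """
    state = set(start)
    for raw in spec.split(","):
        word = raw.strip().lower()          # words may be given in any case
        enable = not word.startswith("no")  # a "no" prefix turns a phase off
        name = word if enable else word[2:]
        if name == "all":
            targets = set(PHASES)
        elif name in PHASES:
            targets = {name}
            if enable and name == "multi":
                targets.add("other")        # turning MULTI on implies OTHER
            if not enable and name == "other":
                targets.add("multi")        # turning OTHER off turns MULTI off
        else:
            raise ValueError(f"unknown inlining word: {raw!r}")
        if enable:
            state |= targets
        else:
            state -= targets
    return state
```

With this model, noall,singles leaves only singles enabled, singles,noall leaves nothing enabled, and the documented default noall,basics,tail leaves basics and tail enabled.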

--language lang

-l lang

This option specifies the output language. Currently this should be one of the following:

Language	Input files	Output files	Description
`ansi-c`	`.sid`, `.act`	`.c`, `.h`	Generated parser in ANSI standard C.
`pre-ansi-c`	`.sid`, `.act`	`.c`, `.h`	Same as `ansi-c`, sans prototypes.
`test`	`.sid`	–	For testing grammars.
`bnf`	`.sid`	`.bnf`	BNF grammar output.

The default is ansi-c.

The ansi-c and pre-ansi-c languages are basically the same. The only difference is that ansi-c initially uses function prototypes, and pre-ansi-c doesn't. Each language takes two input files, a grammar file and an actions file, and produces two output files, a C source file containing the generated parser and a C header file containing the external declarations for the parser.

The test language only takes one input file, and produces no output file. It may be used to check that a grammar is valid. In conjunction with the dump file, it may be used to check the transformations that would be applied to the grammar. There are no language specific options for the test language.

The bnf language takes one input file (the grammar) and produces one output file. It is intended to provide a convenient means to convert sid grammars into other formats, or to inspect their contents in a simpler form.
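The per-language file counts above can be summarised as a lookup table. The Python sketch below is a hypothetical helper, not part of sid; it assumes that input files precede output files on the command line, which this page does not state explicitly, and the file names in the usage line are made up.

```python
# (number of input files, number of output files) for each output language,
# as given in the language table above.
FILE_COUNTS = {
    "ansi-c": (2, 2),      # grammar + actions in, C source + header out
    "pre-ansi-c": (2, 2),
    "test": (1, 0),        # grammar in, no output file
    "bnf": (1, 1),         # grammar in, BNF out
}


def split_files(files, language="ansi-c"):
    """Split the file arguments into (inputs, outputs) for a language.

    Assumes inputs come first; raises ValueError on a wrong file count.
    """
    n_in, n_out = FILE_COUNTS[language]
    if len(files) != n_in + n_out:
        raise ValueError(
            f"{language} expects {n_in + n_out} files, got {len(files)}"
        )
    return files[:n_in], files[n_in:]


inputs, outputs = split_files(["g.sid", "g.act", "g.c", "g.h"])
```

The default language argument mirrors the documented default of ansi-c.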

--switch opt

-s opt

Pass through opt as a language-specific option. The valid options are described below.

--tab-width num

-t num

This option specifies the number of spaces that a tab occupies. It defaults to 8. It is only used when indenting output.

--version

-v

This option causes the version number and supported languages to be written to the standard error stream.

Language-specific options for C

The following language-specific options apply for the ansi-c and pre-ansi-c languages: