5. Features

  5.1. Predicates
  5.2. Error handling
  5.3. Call by reference
  5.4. Calling entry points

This chapter draws attention to a few of the more interesting features of sid-specified grammars, and how they may be used.

5.1. Predicates

Predicates give the user a mechanism for altering the control flow in ways that terminals alone cannot.

During the factorisation process, rules that begin with predicates are expanded where necessary, so that any predicate that may be used to select an alternative always begins that alternative. For example:

rule1 = {
	rule2 ;
	/* .... */
||
	/* .... */
} ;

rule2 = {
	? = <predicate> ;
	/* .... */
||
	/* .... */
} ;

would be expanded into:

rule1 = {
	? = <predicate> ;
	/* .... */
	/* .... */
||
	/* .... */
	/* .... */
||
	/* .... */
} ;

Also, if a predicate is used to select which alternative to use, it must be the first thing in the alternative, so the following would not be allowed:

rule = {
	<action> ;
	? = <predicate> ;
	/* .... */
||
	/* .... */
} ;

When predicates begin a rule, they are executed (in some arbitrary order) until one of them returns true. The alternative that this predicate begins is then selected. If no predicates return true, then one of the remaining alternatives is selected based upon the current terminal (or an error occurs).

It is important that predicates do not contain dependencies upon the order of evaluation. In practice, predicates are likely to be simple, so this shouldn't be a problem.
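The selection behaviour described above can be sketched in C. This is a hand-written illustration under stated assumptions, not code generated by sid: the terminal codes, the predicate `is_type_name` and the function `select_alternative` are all hypothetical names invented for the example.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical terminal codes; a real parser would get these from its lexer. */
enum { TOK_IDENT, TOK_NUMBER, TOK_OTHER };

/* Hypothetical predicate: true if the text names a type.
 * It has no side effects, so it does not depend on evaluation order. */
static bool is_type_name(const char *text)
{
	return strcmp(text, "int") == 0 || strcmp(text, "char") == 0;
}

/* Returns the number of the alternative that would be selected:
 *   1    the alternative that begins with the predicate,
 *   2, 3 alternatives selected by the current terminal,
 *   0    no alternative applies (an error). */
static int select_alternative(int terminal, const char *text)
{
	/* Predicates that begin alternatives are tried first. */
	if (is_type_name(text)) {
		return 1;
	}

	/* No predicate returned true: fall back to the current terminal. */
	switch (terminal) {
	case TOK_IDENT:
		return 2;
	case TOK_NUMBER:
		return 3;
	default:
		return 0;
	}
}
```

With only one predicate there is no ordering question. With several, each should be written like `is_type_name`, free of side effects, so that the result is the same whichever is tried first.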

When predicates are used within an alternative, they behave like terminals. If they evaluate to true, then parsing continues. If they evaluate to false, then an exception is raised.

5.2. Error handling

If the input given to the parser is valid, then the parser will not need to produce any errors. Unfortunately this is not always the case, so sid provides a mechanism for handling errors.

When an error occurs, an exception is raised. This passes control to the nearest enclosing exception handler. If there is no exception handler at all, the entry point function will return with the current terminal set to the error value.

An exception handler is just an alternative that is executed when a terminal or predicate fails. This should obviate the need to rely upon language specific mechanisms (such as setjmp and longjmp) for error recovery.
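As a rough illustration of the two cases above, the following C sketch contrasts a rule that has a handler with one that does not. It is a hand-written model, not sid output: the lexer structure, the terminal codes (with `TOK_ERROR` standing in for the error value) and every function name are assumptions made for the example. The point is that a failure is an ordinary jump within the function, so no setjmp or longjmp is needed.

```c
#include <stddef.h>

/* Hypothetical terminal codes; TOK_ERROR stands in for the error value. */
enum { TOK_OPEN, TOK_CLOSE, TOK_EOF, TOK_ERROR };

/* Hypothetical lexer state: a token array, a cursor, the current terminal. */
struct lexer {
	const int *tokens;
	size_t pos;
	int current;
};

static void advance(struct lexer *lx)
{
	lx->current = lx->tokens[lx->pos++];
}

/* A rule with an exception handler: when a terminal fails, control
 * jumps to the handler alternative, which here just counts the error. */
static void pair_handled(struct lexer *lx, int *errors)
{
	if (lx->current != TOK_OPEN) goto handler;
	advance(lx);
	if (lx->current != TOK_CLOSE) goto handler;
	advance(lx);
	return;
handler:
	(*errors)++;
}

/* The same rule with no handler: the failure is reported by leaving
 * the current terminal set to the error value. */
static void pair_unhandled(struct lexer *lx)
{
	if (lx->current != TOK_OPEN) goto fail;
	advance(lx);
	if (lx->current != TOK_CLOSE) goto fail;
	advance(lx);
	return;
fail:
	lx->current = TOK_ERROR;
}

/* Sample inputs: one valid, one with a terminal in the wrong place. */
static const int good_input[] = { TOK_OPEN, TOK_CLOSE, TOK_EOF };
static const int bad_input[] = { TOK_OPEN, TOK_OPEN, TOK_EOF };

/* Runs the handled rule and returns the number of errors counted. */
static int run_handled(const int *tokens)
{
	struct lexer lx = { tokens, 0, TOK_EOF };
	int errors = 0;

	advance(&lx);	/* load the first terminal */
	pair_handled(&lx, &errors);
	return errors;
}

/* Runs the unhandled rule and returns the terminal it finished on. */
static int run_unhandled(const int *tokens)
{
	struct lexer lx = { tokens, 0, TOK_EOF };

	advance(&lx);
	pair_unhandled(&lx);
	return lx.current;
}
```

A caller of `run_unhandled` can detect failure by comparing the result with `TOK_ERROR`, which mirrors checking the current terminal after an entry point function returns.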

5.3. Call by reference

By default, sid passes arguments using call by copy semantics, and does not allow the parameters of rules and actions to be mutated (although inlined rules, and rules created during factoring, have call by reference parameters). It is possible to give rule and action parameters call by reference semantics by using the & symbol in the type specification (as described earlier). It is also possible to mutate the parameters of actions by using the @= substitution in the action body (also described earlier). It is important to use the correct substitutions in action definitions, as sid uses this information to decide where it can optimise the output code.

If a call by copy parameter is mutated, then sid will introduce a new temporary variable and copy the parameter into it; it is this temporary that is then mutated. Similar code will be output for rules that have call by copy parameters that are mutated (e.g. when passed as a call by reference argument to an action that mutates its parameters).
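The effect of the temporary can be modelled in C roughly as follows. This is only a sketch of the semantics, not the code sid emits: `act_increment`, `rule_by_ref` and `rule_by_copy` are hypothetical names, and a plain `int` stands in for whatever type the parameter has.

```c
/* Hypothetical action that mutates its argument, as an action body
 * using the @= substitution would. */
static void act_increment(int *n)
{
	*n += 1;
}

/* Call by reference parameter: the mutation is visible to the caller. */
static void rule_by_ref(int *n)
{
	act_increment(n);
}

/* Call by copy parameter that is mutated: the value is copied into a
 * temporary, and only the temporary is mutated. */
static int rule_by_copy(int n)
{
	int tmp = n;	/* temporary introduced because n is mutated */

	act_increment(&tmp);
	return tmp;
}
```

After `rule_by_ref(&x)` the caller's `x` has changed, whereas `rule_by_copy(y)` leaves the caller's `y` as it was.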

5.4. Calling entry points

When calling a function that implements an entry point rule, it should be called with the rule's parameters as the first arguments, followed by the addresses of the rule's results as the remaining arguments. The parameters should have their addresses passed if they are of a type that has a parameter assignment operator defined, or if the parameter is a call by reference parameter.

For example, given the following rule:

rule1 : ( :Type1T, :Type2T, :Type3T & ) -> ( :Type4T ) ;

where Type2T has a parameter assignment operator defined, and rule1 is mapped to rule1 (and the type names are mapped to themselves), the call would be something like:

Type1T a = make_type1 () ;
Type2T b = make_type2 () ;
Type3T c = make_type3 () ;
Type4T d ;

rule1 ( a, &b, &c, &d ) ;
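To make the argument types explicit, here is a compilable sketch of the same convention. Everything in it beyond the names `rule1` and `Type1T` to `Type4T` is assumed for illustration: the typedefs are placeholders, the body of `rule1` is a stand-in for the generated function, and `call_rule1` is a hypothetical wrapper.

```c
/* Placeholder types; real definitions would come from the user's code. */
typedef int Type1T;
typedef struct { int value; } Type2T;	/* assumed to have a parameter assignment operator */
typedef int Type3T;
typedef int Type4T;

/* Stand-in for the generated entry point function:
 *   p1 is passed by value,
 *   p2 is passed by address (its type has a parameter assignment operator),
 *   p3 is passed by address (call by reference parameter),
 *   r1 is the address of the result. */
static void rule1(Type1T p1, Type2T *p2, Type3T *p3, Type4T *r1)
{
	*p3 += 1;	/* a call by reference parameter may be updated */
	*r1 = p1 + p2->value + *p3;
}

/* Hypothetical caller: parameters first, then addresses of results. */
static Type4T call_rule1(void)
{
	Type1T a = 1;
	Type2T b = { 2 };
	Type3T c = 3;
	Type4T d;

	rule1(a, &b, &c, &d);
	return d;
}
```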