next up previous contents
Next: Default precedence values Up: Syntax and Basic Parsing Previous: Terms and preregularity   Contents


Parsing

As seen in previous sections, the Maude language supports user-definable syntax including mixfix operator declarations. Parsing is done in stages using a bison/flex-based parser for Maude's surface syntax, a grammar generator which generates the context-free grammar for the user-defined mixfix parts of a Maude module over the user's signature, and the MSCP context-free parser (generator) that generates a parser for the module's context-free grammar. MSCP was developed by J. Quesada [68,67].

With mixfix syntax, the occurrence of ambiguities in the parsing of terms is very common. Of course, we can always provide unambiguous grammars, which are frequently surprisingly large, or use parentheses for breaking the possible ambiguities. But usually we would like to have a more powerful alternative. Maude reduces such ambiguities by using a mechanism based on precedence values and gathering patterns.

Let us assume the following declarations for some arithmetic expressions:

  sort Nat .
  ops 1 2 3 : -> Nat .
  ops _+_ _*_ : Nat Nat -> Nat .

An expression like 1 + 2 * 3 is ambiguous, since both (1 + 2) * 3 and 1 + (2 * 3) are valid parses. This kind of ambiguity is usually solved by assigning a precedence to each of the operators. In Maude, the precedence of an operator is given by a natural number,3.3where a lower value indicates a tighter binding.

Operator precedence then defines how an expression should be parsed when several operators are present. We can assign a precedence to an operator with a precedence (abbreviated prec) attribute, which takes the precedence value as an argument. For example, one would expect multiplication to be evaluated before addition. Thus, we can give precedences, e.g., $33$ and $31$ to the operators _+_ and _*_, respectively, as follows:

  op _+_ : Nat Nat -> Nat [prec 33] .
  op _*_ : Nat Nat -> Nat [prec 31] .

The term 1 + 2 * 3 is now unambiguous: its only possible parse is 1 + (2 * 3).

Precedence can be overridden using parentheses; we can always write (1 + 2) * 3 in case this is the term we are interested in. For those operators for which the user does not specify a precedence value, a default one is given (see Section 3.9.1 for a discussion on the default precedence values). For example, both operators _+_ and _*_ above get $41$ as their default precedence, and hence the ambiguity.

The precedence mechanism is not enough, however. For example, the expression 1 + 2 + 3 is still ambiguous, because both parses (1 + 2) + 3 and 1 + (2 + 3) are possible. Usually, programming languages define a way of associating operators to solve this kind of problems, so that the associativity of the operators determines which is evaluated first. For example, addition usually is left-associative, and therefore we expect to parse it as (1 + 2) + 3. In Maude, we can specify not only the associativity of operators, but general gathering patterns for each operator.

The gathering pattern of an operator restricts the precedences of terms that are allowed as arguments. We give a (non-empty) sequence of as many E, e, or & values as the number of arguments in the operator, that is, one of these values for each argument position:

In fact, the precedence values work because of their combination with the gathering patterns. For example, the precedence values given to _+_ and _*_ work as expected because their default gathering pattern is (E E) (see Section 3.9.2), which forces them to be applied only to terms of smaller or equal precedence value. Thus, 1 + (2 * 3) is a valid parse for 1 + 2 * 3. On the other hand, since the precedence of a term is given by the precedence of its top operator, (1 + 2) * 3 is not a valid parse for 1 + 2 * 3, because the term 1 + 2 has precedence value 33, which is greater than the precedence of _*_.

Moreover, by default, all constants have precedence $0$ (see Section 3.9.1), and therefore they are also valid arguments for both operators.

We can specify _+_ and _*_ as left-associative by giving to them gathering pattern (E e).

  op _+_ : Nat Nat -> Nat [prec 33 gather (E e)] .
  op _*_ : Nat Nat -> Nat [prec 31 gather (E e)] .

In this way, we force the second argument of these operators to be of a strictly lower precedence. Then, a term with _+_ as top operator (or any other operator with the same precedence) like 2 + 3 is nonvalid as second argument for _+_. But it would be valid as first argument, since terms with equal precedence are allowed. Now the only possible parse for the expression 1 + 2 + 3 is (1 + 2) + 3.

Note that parentheses could be described as an operator (_) with precedence $0$ and gathering pattern (&). Thus, any term can appear inside parentheses, and any subterm of a term can be enclosed in parentheses.



Subsections
next up previous contents
Next: Default precedence values Up: Syntax and Basic Parsing Previous: Terms and preregularity   Contents
The Maude Team