Prev Up
Go backward to 2.7.5 Default Precedence and Gathering
Go up to 2.7 Parsing, Bubbles and Meta-Parsing

2.7.6 Tokens, Bubbles and Metaparsing

In order to generate in Maude an environment for a language L, including the case of a language with user-definable syntax, the first thing we need to do is to define the syntax for L-modules. This can be done by extending the module META-LEVEL with a data type ModuleL for L-modules, and with other auxiliary data types for commands and other constructs. Maude provides great flexibility to do this thanks to its mixfix front-end and to the use of bubbles19 (any nonempty string of Maude identifiers). The intuition behind bubbles is that they correspond to pieces of a module in a language that can only be parsed once the grammar introduced by the signature of the module is available.

The idea is that for a language that allows modules with user-definable syntax--as it is the case for Maude--it is natural to see its syntax as a combined syntax, at two different levels: what we may call the top level syntax of the language, and the user-definable syntax introduced in each module. The data type bubble allows us to reflect this duality of levels in the syntax definition. Similar ideas have been exploited using ASF+SDF [59].

To illustrate this concept, suppose that we want to define the syntax of Maude in Maude. Consider the following Maude module:

  fmod NAT3 is
     sort Nat3 .
     op s_ : Nat3 -> Nat3 .
     op 0 : -> Nat3 .
     eq |s s s 0| = |0| .
  endfm

Notice that the strings of characters inside the boxes are not part of the top level syntax of Maude. In fact, they can only be parsed with the grammar associated to the signature of the module NAT3. In this sense, we say that the syntax for Maude modules is a combination of two levels of syntax. The term s s s 0, for example, has to be parsed in the grammar associated to the signature of NAT3. The definition of the syntax of Maude in Maude must reflect this duality of syntax levels.

So far, we have talked about bubbles in a generic way. In fact, there can be many different kinds of bubbles. In Maude we can define different types of bubbles as built-in data types by parameterizing their definition. Thus, for example, a bubble of length one, which we call a token, can be defined as follows.

    sort Token .
 
    op token : Qid -> Token 
           [special
               (id-hook Bubble        (1 1)
                op-hook qidBaseSymbol (<Qids> : -> Qid))] .
Any name can be used to define a bubble sort. It is the special attribute
               id-hook Bubble        (1 1)
that makes the sort Token a bubble sort. The second argument of the id-hook special attribute indicates the minimum and maximum length of such bubbles as strings of identifiers. Therefore, Token has only bubbles of size 1. To specify a bubble of any length we would use the pair of values 1 and -1. The operator used in the declaration of the bubble, in this case the operator token, is a bubble constructor that represents tokens in terms of their quoted form. For example, the token |abc123| is represented as token('abc123).

We can define bubbles of any length, that is, nonempty sequences of Maude identifiers, with the following declarations.

    sort Bubble .

    op bubble : QidList -> Bubble 
           [special
               (id-hook Bubble        (1 -1)
                op-hook qidListSymbol 
                        (__ : QidList QidList -> QidList)
                op-hook qidBaseSymbol (<Qids> : -> Qid))] .
In this case, the system will represent the bubble as a list of quoted identifiers under the constructor bubble. For example, the bubble |ab cd ef| is represented as bubble('ab 'cd 'ef).

Different types of bubbles can be defined using the id-hook special attribute Exclude, which takes as parameter a list of identifiers to be excluded from the given bubble, that is, the bubble being defined cannot contain such identifiers.

Suppose that instead of the declarations given in the module MINI-MAUDESYNTAX below, we had instead given the following two declarations to specify the syntax of an operator declaration in a module.

    op op_: ->_. : Token Token -> Decl .
    op op_:_->_. : Token Bubble Token -> Decl .
With these declarations, bubbles could be bigger than expected. In particular, the following parse would be possible.
  op |s_| : |Nat3 -> Nat3 . op 0 :| -> |Nat3| .

We can use the id-hook special attribute Exclude to avoid this situation. We can declare the sort NeTokenList with constructor neTokenList as a list of identifiers, of any length greater than one, excluding20 the identifier `.' with the following declarations.

    sort NeTokenList .

    op neTokenList : QidList -> NeTokenList 
           [special
               (id-hook Bubble        (1 -1)
                op-hook qidListSymbol 
                        (__ : QidList QidList -> QidList)
                op-hook qidBaseSymbol (<Qids> : -> Qid)
                id-hook Exclude (.))] .

We are now ready to give the signature to parse modules such as NAT3 above. The following module MINI-MAUDE-SYNTAX uses the above definitions of sorts Token, Bubble and NeTokenList to define the syntax of a sublanguage of Maude, namely, many sorted, unconditional, functional modules, in which the declarations of sorts, variables and operators have to be done one at a time, and in which no attributes are supported for operators.

  fmod MINI-MAUDE-SYNTAX is
    including QID-LIST .
    sorts Bubble Token NeTokenList
          PreModule PreCommand
          Decl DeclList .
    subsort Decl < DeclList . 
 
    op token : Qid -> Token 
           [special
               (id-hook Bubble        (1 1)
                op-hook qidBaseSymbol (<Qids> : -> Qid))] .
    op bubble : QidList -> Bubble 
           [special
               (id-hook Bubble        (1 -1)
                op-hook qidListSymbol 
                        (__ : QidList QidList -> QidList)
                op-hook qidBaseSymbol (<Qids> : -> Qid))] .
    op neTokenList : QidList -> NeTokenList 
           [special
               (id-hook Bubble        (1 -1)
                op-hook qidListSymbol 
                        (__ : QidList QidList -> QidList)
                op-hook qidBaseSymbol (<Qids> : -> Qid)
                id-hook Exclude (.))] .
 
    *** sort declaration
    op sort_. : Token -> Decl .
 
    *** operator declaration
    op op_: ->_. : Token Token -> Decl .
    op op_:_->_. : Token NeTokenList Token -> Decl .
 
    *** variable declaration
    op var_:_. : Token Token -> Decl .
 
    *** equation declaration
    op eq_=_. : Bubble Bubble -> Decl .
 
    *** functional module
    op fmod_is_endfm : Token DeclList -> PreModule .
    op __ : DeclList DeclList -> DeclList [assoc gather(e E)] .
  endfm
Notice how we explicitly declare operators that correspond to the top level syntax of Maude, and how we represent as terms of sort Bubble those pieces of the module--namely, terms in equations--that can only be parsed with the user-defined syntax.

Then, the functional module NAT3 above can be parsed as a term of sort PreModule in MINI-MAUDE-SYNTAX. The name of this sort reflects the fact that not all terms of sort PreModule do actually represent Maude modules. In particular, for a term of sort PreModule to represent a Maude module all the bubbles must be correctly parsed as terms in the module's user-defined syntax.

When calling the function meta-parse with the meta-representation of the module MINI-MAUDE-SYNTAX21 and the previous module transformed into a list of quoted identifiers22, that is,

  Maude> red meta-parse( |¯MINI-MAUDE-SYNTAX¯| ,
                        'fmod 'NAT3 'is
                            'sort 'Nat3 '.
                            'op 's_ ': 'Nat3 '-> 'Nat3 '.
                            'op '0 ': '-> 'Nat3 '.
                            'eq 's 's 's '0 '= '0 '.
                        'endfm) .

we get the following metaterm as a result.

  result Term: 
     'fmod_is_endfm[
         |'NAT3|,
         '__['sort_.[|'Nat3|],
             '__['op_:_->_.[|'s_|, |'Nat3|, |'Nat3|],
                 '__['op_:`->_.[|'0|, |'Nat3|],
                     'eq_=_.[|'s 's 's '0|, |'0|]]]]]

Of course, Maude does not return these boxes. Instead, the system returns the bubbles using their constructor form as specified in their corresponding declarations. For example, the bubbles |'Nat3| and |'s 's 's '0| are represented, respectively, as token('Nat3) and bubble('s 's 's '0). Maude returns them meta-represented. The result given by Maude is therefore the following.

  result Term:
     'fmod_is_endfm[
        'token[{''NAT3}'Qid], 
        '__['sort_.['token[{''Nat3}'Qid]], 
            '__['op_:_->_.['token[{''s_}'Qid], 
                           'neTokenList[{''Nat3}'Qid], 
                           'token[{''Nat3}'Qid]], 
                '__['op_:`->_.['token[{''0}'Qid], 
                               'token[{''Nat3}'Qid]], 
                    'eq_=_.['bubble['__[{''s}'Qid, {''s}'Qid,  
                                        {''s}'Qid, {''0}'Qid]], 
                            'bubble[{''0}'Qid]]]]]]

This result is a metaterm of sort Term. To convert this term into a term of sort FModule is now straightforward. As already mentioned, we first have to extract from the term the module's signature. For this, we can use an equationally defined function

   op extractSignature : Term -> FModule? .
Notice that extractSignature is a partial function that is not well defined for metaterms of sort Term that do not meta-represent terms of sort PreModule in MINI-MAUDE-SYNTAX. Therefore, the result is in general an element of an error supersort FModule? of FModule. Once we have the signature of the module--expressed as a functional module with no equations and no membership axioms--we can then build terms of sort FModule in the same way with the equationally defined function solveBubbles, that recursively replaces each bubble in an equation by the result of calling meta-parse with the already extracted signature and with the quoted identifier form of the bubble.
   op solveBubbles : Term FModule -> FModule? .
The partial function processPreModuleTerm takes a term and, if it has the appropriate form--that is, if it is a term meta-representing a term of sort PreModule in MINI-MAUDE-SYNTAX, and, furthermore, the solveBubbles function succeeds in parsing the bubbles in equations as terms--then it returns a term of sort FModule.
   op processPreModuleTerm : Term -> FModule? .
   
   eq processPreModuleTerm(T)
     = solveBubbles(T, extractSignature(T)) .

We have then the following reduction.
red processPreModuleTerm(
meta-parse( |¯MINI-MAUDE-SYNTAX¯| ,
'fmod 'NAT3 'is
'sort 'Nat3 '.
'op 's_ ': 'Nat3 '-> 'Nat3 '.
'op '0 ': '-> 'Nat3 '.
'eq 's 's 's '0 '= '0 '.
'endfm)) .

  Result FModule : fmod 'NAT3 is 
                      nil 
                      sorts 'Nat3 . 
                      none 
                      op '0 : nil -> 'Nat3 [none] . 
                      op 's_ : 'Nat3 -> 'Nat3 [none] . 
                      none 
                      none 
                      eq 's_['s_['s_[{'0}'Nat3]]] = {'0}'Nat3 .
                   endfm

Prev Up