next up previous contents
Next: String and number conversions Up: Predefined Data Modules Previous: Floating-point numbers   Contents


Strings

The module STRING declares sorts and operators for manipulating strings of characters. Strings of length one form a subsort Char of String. Operations on strings are based on the SGI rope package [5], which has been optimized for functional programming, where copying with modification is supported efficiently, whereas arbitrary in-place updates are not.

Strings are input and output using the usual convention of enclosing the string characters in a pair of matching quotes "...". When a string is parsed, it is interpreted using a subset of ANSI C backslash escape conventions [47, Section A2.5.2].

To define the results of searching a string for an occurrence of another substring the sort FindResult is introduced. This sort consists of the natural numbers, returned as the index in the string where a found substring begins (string indexing begins with 0), and a special constant notFound, returned if no occurrence is found.

  fmod STRING is
    protecting NAT .
    sorts String Char FindResult .
    subsort Char < String .
    subsort Nat < FindResult .
    op <Strings> : -> Char [special (...)] .
    op <Strings> : -> String [ditto] .
    op notFound : -> FindResult [ctor] .

The operators ascii and char convert between characters and ASCII codes.

    *** conversion between ascii code and character
    op ascii : Char -> Nat [special (...)] .
    op char : Nat ~> Char [special (...)] .

For a natural number n less than 256 and a character c, we have $\texttt{ascii(char(n))} = \texttt{n}$ and $\texttt{char(ascii(c))} = \texttt{c}$. For a natural number n greater than 255, char(n) is an error term of kind [String]. For example,

  Maude> red in STRING : ascii("#") .
  result NzNat: 35

  Maude> red char(35) .
  result Char: "#"

  Maude> red ascii("a") .
  result NzNat: 97

  Maude> red  char(97) .
  result Char: "a"

  Maude> red char(255) .
  result Char: "\377"

On strings, _+_ denotes the concatenation operation, with identity the empty string, "". String length is computed by the length operator.

    *** string concatenation
    op _+_ : String String -> String 
        [prec 33 gather (E e) special (...)] .

    *** string length
    op length : String -> Nat [special (...)] .

Here are some examples.

  Maude> red in STRING : "abc" + "def" .
  result String: "abcdef"

  Maude> red "ab" + "cd" + "ef" .
  result String: "abcdef"

  Maude> red "abc" + "" .
  result String: "abc"

  Maude> red length("abcdef") .
  result NzNat: 6

  Maude> red length("") .
  result Zero: 0

The operators substr, find, and rfind deal with finding and extracting substrings. Remember that string indexing begins with 0.

    *** substring
    *** second argument is starting position, third is length
    op substr : String Nat Nat -> String [special (...)] .

    *** starting position of substring (second argument)
    *** least one >= third argument (find)
    *** greatest one <= third argument (rfind)
    op find : String String Nat -> FindResult [special (...)] .
    op rfind : String String Nat -> FindResult [special (...)] .

The expression substr(S:String, Start:Nat, Len:Nat) returns the substring of S:String of length Len:Nat beginning at position Start:Nat. If the value of the term Start:Nat + Len:Nat is greater than length(S:String) then the returned substring is the tail of S:String starting from position Start:Nat. This will be empty if the starting position is past the end of the string.

  Maude> red in STRING : substr("abc", 0, 2) .
  result String: "ab"

  Maude> red substr("abc", 1, 2) .
  result String: "bc"

  Maude> red substr("abc", 1, 3) .
  result String: "bc"

  Maude> red substr("abc", 3, 2) .
  result String: ""

find searches for the first match from the beginning of the string, while rfind searches from the end of the string backwards.

find(S:String, Pat:String, Start:Nat) returns the least index of an occurrence of Pat:String in S:String that is greater than or equal to Start:Nat. If no such index exists the constant notFound is returned.

rfind(S:String, Pat:String, Start:Nat) returns the greatest index of an occurrence of Pat:String in S:String that is less than or equal to Start:Nat. If no such index exists the constant notFound is returned.

  Maude> red in STRING : find("abc", "b", 0) .
  result NzNat: 1

  Maude> red find("abc", "b", 1) .
  result NzNat: 1

  Maude> red find("abc", "b", 2) .
  result FindResult: notFound

  Maude> red find("abc", "d", 2) .
  result FindResult: notFound

  Maude> red rfind("abc", "b", 2) .
  result NzNat: 1

  Maude> red rfind("abc", "b", 1) .
  result NzNat: 1

  Maude> red rfind("abc", "b", 0) .
  result FindResult: notFound

  Maude> red rfind("abc", "d", 2) .
  result FindResult: notFound

Some properties relating substr, find, and rfind are the following, where $S$ and $P$ are variables of sort String, and $I$, $J$, and $K$ are variables of sort Nat such that length($S$) = $K$ and length($P$) = $J$.


\begin{displaymath}\begin{array}{@{\hspace*{-1.5em}}l}
I \leq \texttt{find(}S, P...
...ttt{rfind(}S, P, I \texttt{)+} J, K\texttt{)} = S
\end{array} \end{displaymath}

The operators _<_, _<=_, _>_, and _>=_ denote string comparison operations using the lexicographic order, where characters are compared going through their ASCII codes.

    *** lexicographic string comparison
    op _<_ : String String -> Bool [prec 37 special (...)] .
    op _<=_ : String String -> Bool [prec 37 special (...)] .

    op _>_ : String String -> Bool [prec 37  special (...)] .
    op _>=_ : String String -> Bool [prec 37 special (...)] .
  endfm

Here are some examples.

  Maude> red in STRING : "abc" < "abd" .
  result Bool: true

  Maude> red "abc" < "abb" .
  result Bool: false

  Maude> red "abc" < "abcd" .
  result Bool: true


next up previous contents
Next: String and number conversions Up: Predefined Data Modules Previous: Floating-point numbers   Contents
The Maude Team