The module STRING declares sorts and operators for manipulating strings of characters. Strings of length one form a subsort Char of String. Operations on strings are based on the SGI rope package [5], which has been optimized for functional programming, where copying with modification is supported efficiently, whereas arbitrary in-place updates are not.
Strings are input and output using the usual convention of enclosing the string
characters in a pair of matching quotes "...". When a string is
parsed, it is interpreted using a subset of ANSI C backslash escape
conventions [47, Section A2.5.2].
To define the results of searching a string for an occurrence of another substring the sort FindResult is introduced. This sort consists of the natural numbers, returned as the index in the string where a found substring begins (string indexing begins with 0), and a special constant notFound, returned if no occurrence is found.
fmod STRING is
protecting NAT .
sorts String Char FindResult .
subsort Char < String .
subsort Nat < FindResult .
op <Strings> : -> Char [special (...)] .
op <Strings> : -> String [ditto] .
op notFound : -> FindResult [ctor] .
The operators ascii and char convert between characters and ASCII codes.
*** conversion between ascii code and character
op ascii : Char -> Nat [special (...)] .
op char : Nat ~> Char [special (...)] .
For a natural number n less than 256 and a character c,
we have
and
.
For a natural number n greater than 255, char(n) is
an error term of kind [String]. For example,
Maude> red in STRING : ascii("#") .
result NzNat: 35
Maude> red char(35) . result Char: "#"
Maude> red ascii("a") .
result NzNat: 97
Maude> red char(97) . result Char: "a"
Maude> red char(255) . result Char: "\377"
On strings, _+_ denotes the concatenation operation, with identity the
empty string, "". String length is computed by the length
operator.
*** string concatenation
op _+_ : String String -> String
[prec 33 gather (E e) special (...)] .
*** string length
op length : String -> Nat [special (...)] .
Here are some examples.
Maude> red in STRING : "abc" + "def" . result String: "abcdef"
Maude> red "ab" + "cd" + "ef" . result String: "abcdef"
Maude> red "abc" + "" . result String: "abc"
Maude> red length("abcdef") .
result NzNat: 6
Maude> red length("") .
result Zero: 0
The operators substr, find, and rfind deal with finding and extracting substrings. Remember that string indexing begins with 0.
*** substring
*** second argument is starting position, third is length
op substr : String Nat Nat -> String [special (...)] .
*** starting position of substring (second argument)
*** least one >= third argument (find)
*** greatest one <= third argument (rfind)
op find : String String Nat -> FindResult [special (...)] .
op rfind : String String Nat -> FindResult [special (...)] .
The expression substr(S:String, Start:Nat, Len:Nat)
returns the substring of S:String of length Len:Nat beginning
at position Start:Nat. If the value of the term Start:Nat + Len:Nat is greater
than length(S:String) then the returned substring is the tail of
S:String starting from position Start:Nat. This will be empty
if the starting position is past the end of the string.
Maude> red in STRING : substr("abc", 0, 2) .
result String: "ab"
Maude> red substr("abc", 1, 2) .
result String: "bc"
Maude> red substr("abc", 1, 3) .
result String: "bc"
Maude> red substr("abc", 3, 2) .
result String: ""
find searches for the first match from the beginning of the string, while rfind searches from the end of the string backwards.
find(S:String, Pat:String, Start:Nat) returns the least index of an
occurrence of Pat:String in S:String that is greater than or
equal to Start:Nat. If no such index exists the constant
notFound is returned.
rfind(S:String, Pat:String, Start:Nat) returns the greatest index of an
occurrence of Pat:String in S:String that is less than or
equal to Start:Nat. If no such index exists the constant
notFound is returned.
Maude> red in STRING : find("abc", "b", 0) .
result NzNat: 1
Maude> red find("abc", "b", 1) .
result NzNat: 1
Maude> red find("abc", "b", 2) .
result FindResult: notFound
Maude> red find("abc", "d", 2) .
result FindResult: notFound
Maude> red rfind("abc", "b", 2) .
result NzNat: 1
Maude> red rfind("abc", "b", 1) .
result NzNat: 1
Maude> red rfind("abc", "b", 0) .
result FindResult: notFound
Maude> red rfind("abc", "d", 2) .
result FindResult: notFound
Some properties relating substr, find, and rfind are
the following, where
and
are variables of sort
String, and
,
, and
are variables of
sort Nat such that length(
) =
and
length(
) =
.
The operators _<_, _<=_, _>_, and _>=_ denote
string comparison operations using the lexicographic order, where characters
are compared going through their ASCII codes.
*** lexicographic string comparison
op _<_ : String String -> Bool [prec 37 special (...)] .
op _<=_ : String String -> Bool [prec 37 special (...)] .
op _>_ : String String -> Bool [prec 37 special (...)] .
op _>=_ : String String -> Bool [prec 37 special (...)] .
endfm
Here are some examples.
Maude> red in STRING : "abc" < "abd" . result Bool: true
Maude> red "abc" < "abb" . result Bool: false
Maude> red "abc" < "abcd" . result Bool: true