Primitive Pattern structures (functions/variables) 10/29/2002

        1. span(Nonempty-argument)
             span(" "): A run of blanks (of the maximum possible length)
             span("0123456789"): A run of digits
             span("ABCDEFGHIJKLMNOPQRSTUVWXYZ"): A word (A run of uppercase letters)
           a. Matches the longest string beginning at the current cursor
              position that consists solely of characters of the argument.
           b. The argument must be nonnull.
           c. It must match at least one character, or else it fails.
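
           The rules above can be modeled outside SNOBOL4. The following is
           a minimal Python sketch, not SNOBOL4 itself; the function name
           and the convention of returning the new cursor (or None on
           failure) are assumptions made for illustration.

```python
def span(subject, cursor, chars):
    """Model of SPAN(chars): return the new cursor, or None if the match fails."""
    if not chars:
        raise ValueError("SPAN requires a nonnull argument")
    end = cursor
    # Consume the longest possible run of characters drawn from chars.
    while end < len(subject) and subject[end] in chars:
        end += 1
    # SPAN must match at least one character.
    return end if end > cursor else None


print(span("2002 notes", 0, "0123456789"))  # 4: the run "2002"
print(span("notes 2002", 0, "0123456789"))  # None: "n" is not a digit
```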

        2. break(Nonempty-argument)
             break(" "): Everything up to but not including the next blank.
             break(",.;:!?"): Everything up to the next punctuation mark.
             break("+-0123456789"): Everything up to the next sign or digit
                                    (the start of the next number).
           a. Matches the longest string beginning at the current cursor
              position that does not include a character of the argument.
           b. The argument must be nonempty.
           c. It must find a break character in the subject, or else it fails.
           d. It matches the null string if the subject character at the
              current cursor position is one of the break characters.
                "Lamar University"  Break(" LU")        //Null string matched

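           As a rough illustration of rules a through d of break, here is a
           small Python model (a sketch, not SNOBOL4). The name break_ and
           the return convention (new cursor, or None on failure) are
           hypothetical choices for this example.

```python
def break_(subject, cursor, chars):
    """Model of BREAK(chars): return the new cursor, or None if the match fails."""
    if not chars:
        raise ValueError("BREAK requires a nonnull argument")
    for end in range(cursor, len(subject)):
        if subject[end] in chars:
            return end  # stop just before the break character
    return None  # no break character anywhere in the rest of the subject


subject = "Lamar University"
print(break_(subject, 0, " "))    # 5: matches "Lamar"
print(break_(subject, 0, " LU"))  # 0: null string, since "L" is a break character
print(break_("Lamar", 0, " "))    # None: no blank to break on
```
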
        3. fail
           A variable containing a primitive pattern that always fails to match.
           Causes the scanner to back up to seek alternatives.

             &Anchor = 1;  &Trim = 1
             Pat = (Break("A") $ A | Break("B") $ B | Break("C") $ C)  fail
       Loop  String = " " input                 :F(End)
             A =
             B =
             C =
             String Pat
             (Differ(A)  Differ(B)  Ident(C))           :F(Loop)
       *     At this point the input string has at least one A and one B but no C.
       *     (The blank prefixed to the input keeps A and B nonnull when matched.)
             String span(" ") =         ;* Drop leading blanks.
             output = String                            :(Loop)
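
           The effect of the trailing fail can be sketched in Python (a
           model of the logic, not SNOBOL4). The helper names are
           hypothetical. The loop stands in for the scanner backing up
           through every alternative, and the dictionary stands in for the
           immediate assignments made by $.

```python
def break_prefix(subject, ch):
    """Anchored BREAK(ch): text before the first ch, or None if ch is absent."""
    i = subject.find(ch)
    return subject[:i] if i >= 0 else None


def has_a_and_b_but_no_c(line):
    subject = " " + line  # the leading blank keeps a successful capture nonnull
    found = {"A": "", "B": "", "C": ""}
    for ch in "ABC":  # fail forces every alternative to be tried in turn
        prefix = break_prefix(subject, ch)
        if prefix is not None:
            found[ch] = prefix  # immediate assignment survives the failed match
    # Differ(A) Differ(B) Ident(C)
    return found["A"] != "" and found["B"] != "" and found["C"] == ""


for line in ["ALBUM", "CABLE", "BOOK"]:
    print(line, has_a_and_b_but_no_c(line))  # ALBUM True, CABLE False, BOOK False
```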

        4. POS(N) and RPOS(N)
           a. Both match only the null string and are used to test the position
              of the cursor (and so of a matched substring) within the subject.
           b. They never move the cursor; they only test its position.
                &Anchor = 1
                STR Span(" ") POS(6)    ;* to check if cursor position is just past
                                        ;* the 6th character of subject string.
                    The above pattern match succeeds only if the subject string
                    has exactly 6 leading blanks.
                &Anchor = 1
                STR Span(" ") RPOS(6)
                    The above pattern match succeeds if and only if the subject
                    begins with at least one blank, the 6th character from the
                    end is nonblank, and everything preceding that character
                    is blank, as in  "       654321"
                Entire = POS(0)  Pat  RPOS(0)
                STR Entire
                 // The above pattern match succeeds if and only if Pat
                    matches the entire subject string.
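
           The two anchored examples can be checked with a small Python
           model (a sketch under the assumption that Span never gives back
           characters on backtracking, so no retry loop is needed). All
           function names here are hypothetical.

```python
def span_blanks(subject, cursor=0):
    """SPAN(" "): longest run of blanks, at least one, else None."""
    end = cursor
    while end < len(subject) and subject[end] == " ":
        end += 1
    return end if end > cursor else None


def pos(subject, cursor, n):
    """POS(n): succeed without moving if the cursor is at position n."""
    return cursor if cursor == n else None


def rpos(subject, cursor, n):
    """RPOS(n): succeed without moving if n characters remain to the right."""
    return cursor if len(subject) - cursor == n else None


def exactly_six_leading_blanks(subject):  # STR Span(" ") POS(6)
    cursor = span_blanks(subject)
    return cursor is not None and pos(subject, cursor, 6) is not None


def blanks_then_last_six(subject):  # STR Span(" ") RPOS(6)
    cursor = span_blanks(subject)
    return cursor is not None and rpos(subject, cursor, 6) is not None


print(exactly_six_leading_blanks(" " * 6 + "x"))  # True
print(exactly_six_leading_blanks(" " * 7 + "x"))  # False
print(blanks_then_last_six("       654321"))      # True
print(blanks_then_last_six("654321"))             # False: no leading blank
```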

        5. Tab(N), Rtab(N), Rem
           a. Each of these matches a substring of zero or more characters.

              "SNOBOL4" Len(2) Tab(6)

                 In the above example, Len(2) matches the first two chars of
                 the subject string, "SN", and Tab(6) matches the next four
                 chars, "OBOL". In general, Tab(N) matches everything from the
                 current cursor position up to and including the Nth char of
                 the subject; it fails if the cursor is already past position N.

              "SNOBOL4" Len(2) Rtab(2)

                 In the above example, Len(2) matches the first two chars of the
                 subject string, and Rtab(2) matches everything from there up to
                 but not including the last two chars. Hence it matches the
                 three chars "OBO".
                 Rtab(0) is particularly useful for matching everything to the end
                 of the subject string.
                 Rem is equivalent to Rtab(0).

              Last8 = Rtab(8)  Rem  .  LastEight
                 LastEight picks up the last eight chars of a subject string if
                 it is at least that long.
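
           The cursor arithmetic behind these examples can be sketched in
           Python (a model, not SNOBOL4). The function names and the
           convention of returning the new cursor or None are assumptions;
           last_eight models the unanchored Last8 match, which can only
           succeed starting at cursor 0.

```python
def length(subject, cursor, n):
    """LEN(n): advance the cursor by n characters."""
    end = cursor + n
    return end if end <= len(subject) else None


def tab(subject, cursor, n):
    """TAB(n): move the cursor right to position n; never moves left."""
    return n if cursor <= n <= len(subject) else None


def rtab(subject, cursor, n):
    """RTAB(n): move the cursor right until n characters remain."""
    target = len(subject) - n
    return target if cursor <= target else None


def rem(subject, cursor):
    """REM: the same as RTAB(0)."""
    return rtab(subject, cursor, 0)


def last_eight(subject):
    """Model of  Rtab(8) Rem . LastEight."""
    start = rtab(subject, 0, 8)
    if start is None:
        return None  # subject is shorter than eight characters
    return subject[start:rem(subject, start)]


s = "SNOBOL4"
c = length(s, 0, 2)
print(s[c:tab(s, c, 6)])               # OBOL
print(s[c:rtab(s, c, 2)])              # OBO
print(last_eight("PATTERN MATCHING"))  # MATCHING
print(last_eight(s))                   # None
```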

        6. ANY(Nonempty-argument), NotAny(Nonempty-argument)
           a. Each matches exactly one char.
           b. ANY() matches any one char of the argument, while
              NotAny() matches any one char not in the argument.
           c.   ANY("AIOEU") is equivalent to
                "A" |  "I"  |  "O"  |  "E"  |  "U"
                but it is faster than the latter.
           d. SNOBOL identifiers and labels can be defined as follows:
                Letters = "abcdefghijklmnopqrstuvwxyz"
                Letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  Letters
                Digits  = "0123456789"
                Alphanum = Letters  Digits
                Identifier = Any(Letters) (Null  |  Span(Alphanum  "_."))
                Label =  Any(Alphanum) break(" ;")
                // A label starts with a letter or a digit, followed by
                   anything up to but not including a blank or a semicolon.
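
           As a rough check of the Identifier definition, the Python sketch
           below models ANY, NotAny, and a whole-string identifier test, as
           if the pattern were wrapped in POS(0) ... RPOS(0). That wrapping
           and all function names are assumptions for illustration.

```python
import string

LETTERS = string.ascii_uppercase + string.ascii_lowercase
DIGITS = string.digits
ALPHANUM = LETTERS + DIGITS


def any_(subject, cursor, chars):
    """ANY(chars): match exactly one character that is in chars."""
    if cursor < len(subject) and subject[cursor] in chars:
        return cursor + 1
    return None


def notany(subject, cursor, chars):
    """NOTANY(chars): match exactly one character that is not in chars."""
    if cursor < len(subject) and subject[cursor] not in chars:
        return cursor + 1
    return None


def is_identifier(text):
    """Whole-string test of  Any(Letters) (Null | Span(Alphanum "_."))."""
    cursor = any_(text, 0, LETTERS)
    if cursor is None:
        return False
    # Either nothing follows (Null) or the rest is one run of allowed chars.
    return all(ch in ALPHANUM + "_." for ch in text[cursor:])


print(is_identifier("Last8"))         # True
print(is_identifier("8Last"))         # False: starts with a digit
print(is_identifier("Field.Name_1"))  # True
```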

        7. ARB
           a. A pattern variable that matches zero or more chars.
           b. It can be defined as:
              Arb =  Null  |  len(1)  *arb
           c. So, it is composed of
                Null
                Len(1)  Null
                Len(1)  Len(1)  Null
                Len(1)  Len(1)  Len(1)  Null
                Len(1)  Len(1)  Len(1)  Len(1)  Null  and so on.
           d. It tries the alternatives above in that order, and it fails
              once the matched substring cannot be extended any further.
           e. It should not be used as the first component of a pattern
              structure unless associated with a variable for value assignment.
                 Str  Arb Pat   // Not a good idea
                 Str  Pat       // Better idea, and equivalent in unanchored mode.
           f. It should not be used to break fields out of a string if they are
              separated by known delimiters.
                 Str break(",") .  Field  ","  =
                 Str Arb  .        Field  ","  =
              The two statements above accomplish the same thing, but the
              former is much faster.
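
           One way to see why break is cheaper than Arb here is to count
           how often the element after it has to be retried. The Python
           sketch below is a simplified model with hypothetical names; the
           attempt counter is only an illustration, not a measurement of a
           real SNOBOL4 system.

```python
def arb(subject, cursor):
    """ARB: offer the shortest match first, then one char longer each time."""
    for end in range(cursor, len(subject) + 1):
        yield end


def field_with_arb(subject):
    """Model of  Str Arb . Field ","  (returns field, rest, attempts)."""
    attempts = 0
    for end in arb(subject, 0):
        attempts += 1  # the "," test is retried after every extension of Arb
        if subject.startswith(",", end):
            return subject[:end], subject[end + 1:], attempts
    return None


def field_with_break(subject):
    """Model of  Str break(",") . Field ","  (returns field, rest, attempts)."""
    end = subject.find(",")  # one direct scan for the delimiter
    if end < 0:
        return None
    return subject[:end], subject[end + 1:], 1


print(field_with_arb("red,green,blue"))    # ('red', 'green,blue', 4)
print(field_with_break("red,green,blue"))  # ('red', 'green,blue', 1)
```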

        8. ArbNo(Patt): "Arbitrary number of" Patt
           a. This pattern function matches zero or more consecutive occurrences
              of strings each of which matches the argument pattern.
           b. When encountered by the scanner in the forward direction,
              i.e., initially, it matches the Null string.
           c. When "backed into," it tries to increase the length of the substring
              matched by its argument.
                &Anchor = 1
                STR  ArbNo(Len(3))  RPOS(0)
              The above match succeeds only if the length of STR is zero or
              a multiple of three.
           d. It can be defined as follows:
                 ArbNoPatt =  Null |  Patt *ArbNoPatt
           e. Hence, it will match the following in that order:
                Null
                Patt Null
                Patt Patt Null
                Patt Patt Patt Null
                Patt Patt Patt Patt Null     // and so on.
                Example:
                  &Anchor = 1
                  Patt = '123'  |  '1234'  |  '23'  |  '341'  |  '412'
                  Test =  ArbNo(Patt)  $  Output  RPOS(0)
                  '123412341223' Test
                The above produces the following eight output lines,
                the first of which is a blank line (for the Null string matched):

                  123
                  123412
                  123412341
                  1234
                  1234123
                  1234123412
                  123412341223
                  
           f. ARBNO is relatively slow and should be avoided if some other pattern
              suffices like:
                  Span(" ")  or
                  Null |  Span(" ")      instead of
                  ArbNo(" ")
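
           The eight-line trace in item e can be reproduced with a short
           Python model of the recursive definition in item d (a sketch,
           not SNOBOL4). The generator names are hypothetical, and the
           model assumes every alternative of Patt is a nonnull literal
           string, as in the example.

```python
def alternatives(subject, cursor, choices):
    """Patt = c1 | c2 | ...: try each literal in the order written."""
    for choice in choices:
        if subject.startswith(choice, cursor):
            yield cursor + len(choice)


def arbno(subject, cursor, choices):
    """ArbNoPatt = Null | Patt *ArbNoPatt, yielding candidate end cursors."""
    yield cursor  # Null is offered first
    for nxt in alternatives(subject, cursor, choices):
        yield from arbno(subject, nxt, choices)  # then one more Patt


def trace(subject, choices):
    """Model of the anchored match  ArbNo(Patt) $ Output RPOS(0)."""
    lines = []
    for end in arbno(subject, 0, choices):
        lines.append(subject[:end])  # $ Output fires on every candidate
        if end == len(subject):      # RPOS(0)
            return True, lines
    return False, lines


patt = ["123", "1234", "23", "341", "412"]
matched, lines = trace("123412341223", patt)
print(matched)     # True
for line in lines:
    print(line)    # eight lines; the first is empty
```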

        9. Bal
           a. A pattern variable that initially matches the shortest nonnull
              substring balanced with respect to parentheses. When backed
              into, it extends the match to the next longer balanced substring.
           b. Example:
                BalTest = Bal $  Output  Fail
                &Anchor = 0             ;* Default
                "(((A+B*C)*D))"  BalTest
              Above will produce the following:
                 (((A+B*C)*D))
                  ((A+B*C)*D)
                   (A+B*C)
                   (A+B*C)*
                   (A+B*C)*D
                    A
                    A+
                    A+B
                    A+B*
                    A+B*C
                     +
                     +B
                     +B*
                     +B*C
                      B
                      B*
                      B*C
                       *
                       *C
                        C
                          *
                          *D
                           D
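
           The list above can be regenerated with a Python model of Bal (a
           sketch, not SNOBOL4). The function names are hypothetical; the
           outer loop over start positions stands in for the unanchored
           scanner, and the trailing Fail is modeled by collecting every
           candidate instead of stopping at the first.

```python
def bal(subject, cursor):
    """BAL: yield end cursors of successively longer balanced substrings."""
    depth = 0
    for end in range(cursor, len(subject)):
        ch = subject[end]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return  # an unmatched ")" ends all further extension
        if depth == 0:
            yield end + 1  # balanced and nonnull up to here


def bal_trace(subject):
    """Model of  subject  Bal $ Output Fail  in unanchored mode."""
    return [subject[start:end]
            for start in range(len(subject))
            for end in bal(subject, start)]


for line in bal_trace("(((A+B*C)*D))"):
    print(line)
```

           Real output lines carry no leading blanks; the indentation in the
           list above only marks where each match starts in the subject.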

       10. Cursor Position Operator (Unary @):  @X
           a. The value of @X is a pattern structure that matches
              the null string and assigns the current cursor position,
              as an integer, to the variable X.
              The assignment is an immediate value assignment: the value
              is assigned whenever the cursor position operator is
              encountered during pattern matching, whether or not the
              match as a whole goes on to succeed.
           Example:
              &Anchor = 0
              STR = 'TEST$AT$OPERATOR'
              STR  @Head "AT"  @Tail
           In the above example, Head will be 5 and Tail will be 7, since
           @Head and @Tail match the null string at cursor positions 5 and 7,
           respectively. (The cursor position is the number of characters to
           the left of the cursor.)
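
           The immediate nature of the assignment can be seen in a small
           Python model of the unanchored scan (a sketch, not SNOBOL4). The
           function name is hypothetical, and the model assumes the scanner
           tries each starting position in turn from 0.

```python
def scan_with_cursor_capture(subject, literal):
    """Model of the unanchored match  subject @Head literal @Tail."""
    heads = []
    for start in range(len(subject) + 1):
        heads.append(start)  # @Head assigns on every attempt, even failed ones
        if subject.startswith(literal, start):
            tail = start + len(literal)  # @Tail is reached only after the literal
            return heads, tail
    return heads, None  # the match failed; Tail was never assigned


heads, tail = scan_with_cursor_capture("TEST$AT$OPERATOR", "AT")
print(heads)  # [0, 1, 2, 3, 4, 5]: Head ends up holding 5
print(tail)   # 7
```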