1.2 Symbols and Expressions

We now start a systematic definition of Refal. A summary of the language's syntax is also appended as Reference Section B.

A symbol in Refal is the minimal syntactic element of data structures. We use the following four kinds of symbols:

  • symbolic names, also referred to as identifiers;
  • characters;
  • non-negative whole numbers (macrodigits);
  • real numbers.

An identifier is a sequence of characters that starts with a capital letter and may include letters, digits, hyphens (-), and underscores (_). The case of letters, except the first one, does not matter; thus SUM1 and Sum1 represent the same identifier. Hyphens and underscores are interchangeable: Sum-1 and Sum_1 are the same. Identifiers must not be broken by blanks or line transfers. The total length of an identifier must not exceed 15 characters.

Examples of valid identifiers:


  A    X1    Y2g3h56c    CrocoDile    Hit-and-run
Invalid as identifiers:


  a    x1    16x    Me+You    Input/output
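
The identifier rules above can be illustrated with a short Python sketch. This is not part of any Refal implementation: the names is_identifier and canonical are hypothetical, and the sketch assumes that "letters" means the ASCII letters A-Z and a-z.

```python
import re

# Assumption: "letters" means ASCII letters; the first character must be
# a capital letter, the rest may be letters, digits, '-' or '_'.
_IDENTIFIER = re.compile(r"[A-Z][A-Za-z0-9_-]*")
MAX_LENGTH = 15  # total length limit stated in the text


def is_identifier(text: str) -> bool:
    """Return True if text satisfies the identifier rules described above."""
    return len(text) <= MAX_LENGTH and _IDENTIFIER.fullmatch(text) is not None


def canonical(text: str) -> str:
    """Map equivalent spellings (letter case, '-' versus '_') to one form."""
    if not is_identifier(text):
        raise ValueError(f"not a valid identifier: {text!r}")
    return text.upper().replace("_", "-")


if __name__ == "__main__":
    for candidate in ["A", "X1", "CrocoDile", "Hit-and-run", "a", "16x", "Me+You"]:
        print(candidate, is_identifier(candidate))
```

Under these assumptions, canonical("SUM1") equals canonical("Sum1"), and canonical("Sum-1") equals canonical("Sum_1"), which mirrors the equivalences stated in the text.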
 

All printable characters of the computer system can be used as symbols. To distinguish character-symbols from symbolic names, we put the former in quotes. To represent a string of character-symbols, we put the whole string in quotes. Both single quote ' , and double quote " can be used, but the opening and the closing quotes must be the same. Thus 'A' is one symbol, while 'A+B' is a string (sequence) of three symbols: characters 'A' , '+' , and 'B' . To represent the quote itself, we double it, whether isolated or inside a string limited by the same kind of quote. Thus '*,''' is a string of three symbols: an asterisk, a comma and a single quote. It can also be written as "*,'" .

One-letter identifiers should not be confused with characters: A is different from 'A' . Characters are case-sensitive: 'A' is distinct from 'a' .
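
As an illustration of the quoting rules, the following Python sketch decodes one quoted string into its list of character-symbols. The function name unquote is hypothetical, and the sketch handles only a single quoted string in which a quote of the delimiting kind appears doubled; it does not attempt to cover the rest of the Refal lexical syntax.

```python
def unquote(literal: str) -> list[str]:
    """Decode one quoted string into a list of character-symbols.

    The opening and closing quotes must be the same kind; inside the
    string, a doubled delimiter stands for one quote character.
    """
    if len(literal) < 2 or literal[0] not in "'\"" or literal[-1] != literal[0]:
        raise ValueError("not a quoted string")
    delimiter = literal[0]
    body = literal[1:-1]
    symbols = []
    i = 0
    while i < len(body):
        if body[i] == delimiter:
            # A delimiter inside the body is legal only when doubled.
            if i + 1 < len(body) and body[i + 1] == delimiter:
                symbols.append(delimiter)
                i += 2
            else:
                raise ValueError("single delimiter inside the string")
        else:
            symbols.append(body[i])
            i += 1
    return symbols


if __name__ == "__main__":
    print(unquote("'A+B'"))    # three symbols: A, +, B
    print(unquote("'*,'''"))   # asterisk, comma, single quote
    print(unquote('"*,\'"'))   # the same three symbols, double-quoted
```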

Macrodigits in our implementation of Refal-5 are integers in the range from 0 to 2^32-1. Greater numbers can be composed from macrodigits using the base 2^32, as decimal numbers are composed from the ordinary (decimal) digits; this explains the name. To represent negative whole numbers we put '-' in front of digits, as we do when using decimal digits. Like letter-characters, which are different from letter-identifiers, digit-characters are different from number-symbols. '1' is not the same as 1 . While the former is an ordinary decimal digit, the latter is a macrodigit.

Examples: 3306 is one symbol -- a macrodigit with the numerical value of 3306 . '-'25 is a sequence of two symbols: the character '-' followed by the macrodigit 25 . Together they will be taken by arithmetic functions as the number -25. (In programs all characters must be quoted; when reading data with the function Input [see Sec. 2.3] the quotes are not necessary.) The following:


2543 88918 9
is a sequence of three macrodigits which will be understood as

2543*2^64 + 88918*2^32 + 9
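
The positional reading of macrodigits can be checked with a small Python sketch. The function names from_macrodigits and to_macrodigits are hypothetical; the sketch only mirrors the base-2^32 arithmetic described here and says nothing about how a Refal implementation stores numbers.

```python
BASE = 2**32  # each macrodigit lies in the range 0 .. 2^32 - 1


def from_macrodigits(digits: list[int]) -> int:
    """Combine macrodigits, most significant first, into one integer."""
    value = 0
    for digit in digits:
        if not 0 <= digit < BASE:
            raise ValueError(f"not a macrodigit: {digit}")
        value = value * BASE + digit
    return value


def to_macrodigits(number: int) -> list[int]:
    """Split a non-negative integer into macrodigits, most significant first."""
    if number < 0:
        raise ValueError("only non-negative numbers are handled in this sketch")
    digits = []
    while True:
        number, digit = divmod(number, BASE)
        digits.append(digit)
        if number == 0:
            break
    return digits[::-1]


if __name__ == "__main__":
    value = from_macrodigits([2543, 88918, 9])
    print(value == 2543 * 2**64 + 88918 * 2**32 + 9)  # True
    print(to_macrodigits(value))                      # [2543, 88918, 9]
```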

NOTE: If you write something like '--'25 this will not be a syntax error. This is a quite legitimate string of three symbols. An error will ensue if you try to use it as an operand in an arithmetic function.

A real number in Refal-5 is always represented by one symbol, and takes a computer word. A real number may be positive, zero, or negative. In a program, real numbers are written in the usual form, with a period separating the fractional part from the whole part and E introducing the decimal exponent. Examples:


  215.73
  -18E+15
  0.003E-7
(see Reference Section B for a more detailed syntax). Symbols representing real numbers are distinct from macrodigits: 215 is not the same as 215.0 .

Generally, blanks are not counted as symbols; blanks and line transfers are used to separate lexical units of Refal whenever necessary and to position them nicely on the page. The only situation where a blank becomes a symbol is when it is used inside quotes. While '' is a quote-character, ' ' is a blank-character. When used between quoted strings, a blank tells us to unquote them separately. Thus 'a' 'b' is a string of two characters, a and b , while 'a''b' is a string of three characters: a , the quote, and b .

To create data structures in Refal we use parentheses. Unquoted parentheses are not symbols but special signs of Refal. We also refer to them as structure brackets in order to distinguish them from the angular evaluation brackets. Structure brackets must be properly paired according to the well-known simple rules. We call any sequence of symbols and parentheses whose parentheses are correctly paired an object expression. More precisely, an object expression is a sequence of a finite number of terms, where a term is either a single symbol or an expression enclosed in parentheses. The number of terms in an expression may be zero so that an empty object (just nothing) is a legitimate expression. Here are other examples of expressions:


  A
  (A'+'B)'*'(C'-'D)
  Begin (Ho-ho-ho '(' ('A joke')) End
  ()
  (()'100'100() (()) ) [[
Examples of sequences which are not expressions:


  ) End
  A ( B)((C)
  ( A ')';
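
The pairing rule can be sketched in Python as a scan over program text that counts structure brackets and skips anything inside quotes. The function name brackets_paired is hypothetical, and the sketch checks only the pairing of brackets and quotes; it does not verify that the remaining text consists of valid symbols.

```python
def brackets_paired(text: str) -> bool:
    """Return True if every unquoted parenthesis in text has a partner."""
    depth = 0      # number of structure brackets currently open
    quote = None   # the delimiter of the quoted string we are inside, if any
    i = 0
    while i < len(text):
        ch = text[i]
        if quote is not None:
            if ch == quote:
                if i + 1 < len(text) and text[i + 1] == quote:
                    i += 1        # doubled quote: a quote character, stay inside
                else:
                    quote = None  # closing quote
        elif ch in "'\"":
            quote = ch            # opening quote
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:         # a closing bracket with no opening partner
                return False
        i += 1
    return depth == 0 and quote is None


if __name__ == "__main__":
    for sample in ["(A'+'B)'*'(C'-'D)", "()", ") End", "A ( B)((C)", "( A ')';"]:
        print(repr(sample), brackets_paired(sample))
```

Quoted parentheses such as '(' are character-symbols, so the scan ignores them; only structure brackets change the depth counter.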

NOTE: Quoting every character as in one of the examples above:


  (A'+'B)'*'(C'-'D)
may be a nuisance. One must bear in mind, however, that this is necessary only in the text of a Refal program where the appearance of big expressions of this kind is rather unlikely. This expression is typical data. When it is typed in or read from a file as data, using the special function Input , the quotes are necessary only around character-parentheses in order to distinguish them from structure brackets. We would thus type in:


  (A + B) * (C - D)
 

In algebra we use expressions to represent certain sequences of operations over numbers and variables; parentheses in expressions indicate the order in which to perform operations. If the expression above is understood as an algebraic expression, it would be represented by the tree structure shown in Fig. 1.1.

Figure 1.1 The tree for the algebraic expression (A + B) * (B - C)

It should be stressed that the concept of expression in Refal is more general. We do not assume any special interpretation of expressions; they can be used in various ways. A Refal expression is simply a structured object built in a certain way from symbols and structure brackets. For each structure bracket there is exactly one paired bracket of the opposite kind. Together they form a sort of box or pouch. They delimit a sub-system of the overall system, which is a part of the whole, but which still preserves its unity. If you locate one boundary of this sub-structure, the other boundary is uniquely defined. The relation between a system and its sub-systems is a very important aspect of the world. When we create symbolic models of the world, structure brackets model this relation. If the piano, violin, and viola are in Ann's apartment, while the cello and bass are in Bob's, this situation can be modeled by the Refal expression:


  Ann-apt(Piano Violin Viola) Bob-apt(Cello Bass)
 

A Refal expression can be represented by a tree, like an algebraic expression. However, if we treat an expression as just a Refal expression, without interpreting it in any way, the tree should be somewhat different. In Fig. 1.1 we raised the operations over their arguments. But if we do not interpret the expression, then A + B is just a concatenation of three symbols, and they all should be on the same level. The tree should look as in Fig. 1.2. The leaves (end-nodes) of the tree are symbols; the other nodes are parenthesized subexpressions.

Figure 1.2 The tree for the Refal expression (A + B) * (B - C)
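
The uninterpreted tree can be modeled in Python with nested lists, where each list is a parenthesized subexpression and each string is a symbol. This is a minimal sketch: the names tokenize and to_tree are hypothetical, and the tokenizer assumes the unquoted data form, with symbols separated by blanks or parentheses and no quoted characters.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split unquoted data text into parentheses and symbols (an assumption
    of this sketch: no quoted characters occur in the text)."""
    return re.findall(r"[()]|[^\s()]+", text)


def to_tree(tokens: list[str]) -> list:
    """Build nested lists: symbols stay on the level where they occur."""
    stack = [[]]  # stack[-1] is the subexpression being filled
    for token in tokens:
        if token == "(":
            stack.append([])
        elif token == ")":
            if len(stack) == 1:
                raise ValueError("unpaired closing parenthesis")
            finished = stack.pop()
            stack[-1].append(finished)
        else:
            stack[-1].append(token)
    if len(stack) != 1:
        raise ValueError("unpaired opening parenthesis")
    return stack[0]


if __name__ == "__main__":
    print(to_tree(tokenize("(A + B) * (C - D)")))
    # [['A', '+', 'B'], '*', ['C', '-', 'D']]
```

Note that the operation signs end up on the same level as their operands, as the text describes, rather than above them as in an algebraic tree.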

Another use of the second dimension in picturing expressions is to connect the paired parentheses by lines, as if by strings; the hierarchy of subexpressions is then made more visible (see Fig. 1.3). When Refal expressions are represented in a computer, the address of the paired parenthesis is stored with each parenthesis, so that it becomes possible to jump from a parenthesis to its complement in one step, as if running along these strings.

Figure 1.3 Paired parentheses are connected for clarity: (A + B) * (B - C)
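
The idea of storing with each parenthesis the address of its partner can be sketched in Python as a table of positions. The function name pair_links is hypothetical, and a dictionary over token positions stands in for machine addresses; the actual representation inside a Refal implementation is not described here.

```python
def pair_links(tokens: list[str]) -> dict[int, int]:
    """Map the position of each parenthesis to the position of its partner."""
    links = {}
    open_positions = []  # positions of opening parentheses not yet closed
    for position, token in enumerate(tokens):
        if token == "(":
            open_positions.append(position)
        elif token == ")":
            if not open_positions:
                raise ValueError("unpaired closing parenthesis")
            partner = open_positions.pop()
            links[position] = partner
            links[partner] = position
    if open_positions:
        raise ValueError("unpaired opening parenthesis")
    return links


if __name__ == "__main__":
    tokens = ["(", "A", "+", "B", ")", "*", "(", "C", "-", "D", ")"]
    links = pair_links(tokens)
    print(links[0], links[10])  # 4 6
```

Once the table is built, moving from a parenthesis to its complement is a single lookup, which corresponds to the one-step jump along the connecting lines.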




Exercise 1.1 Write the string Joe's Pizza is "cute" using single and double quotes as delimiters.

Exercise 1.2 Write -2^36 as a whole number in Refal.