3. Language Reference Manual

3.1. Syntax Notations

In this section, we define types and identifiers using regular expressions. The following notations are used in this document to show lexical and syntactic rules.

  • A dash - is shorthand for a continuous range of elements; for example, a-z means every letter from a to z.
  • Brackets [ ] enclose a set of items and select exactly one of them. If a caret ^ follows the opening bracket, as in [^ ], exactly one character that does not belong to the list is selected; for example, [^a-z] means any character other than a to z.
  • Parentheses ( ) enclose alternative items, which are separated from each other by vertical bars |.
  • Asterisks * indicate items to be repeated zero or more times.
  • A question mark ? marks the preceding item as optional.
  • A double colon with an equal sign ::= introduces a definition.
  • Braces {n} match when the preceding character, or character range, occurs exactly n times.
  • {n,m} matches when the preceding character occurs at least n times but not more than m times; for example, ba{2,3}b matches baab and baaab but not bab or baaaab.
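The bracket and brace notations above behave like their counterparts in common regular-expression engines, so they can be tried out directly. The following is a minimal sketch using Python's re module; Python is used here only for illustration, and the names bounded, not_lower and matches are hypothetical.

```python
import re

# {n,m}: the preceding character occurs at least n and at most m times.
bounded = re.compile(r"ba{2,3}b")

# [^a-z]: exactly one character that is NOT in the range a to z.
not_lower = re.compile(r"[^a-z]")


def matches(pattern, text):
    """Return True when the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None
```

For instance, matches(bounded, "baab") holds while matches(bounded, "bab") does not, mirroring the example in the list.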

Below, the formal syntax definitions of Yo are written in code boxes. Terminals are marked in bold, while non-terminals are in regular font.

3.2. Lexical Conventions

This section presents the lexical conventions of Yo. It describes which tokens are valid, including the naming convention of identifiers, reserved keywords, operators, separators and whitespace.

3.2.1. Comments

A single-line comment starts with a leading # in the line:

# This is a single line comment

A multi-line comment starts with #( and ends with #):

#( This is a multiline
comment #)

Note

Nested comments are not allowed in Yo.
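To illustrate the two comment forms, here is a minimal Python sketch of a comment stripper. It is not the Yo compiler's implementation: the names COMMENT and strip_comments are hypothetical, and the sketch assumes that # never appears inside a string literal.

```python
import re

# The multi-line form is tried first. The lazy .*? stops at the first "#)",
# which is why an inner "#(" does not open a nested comment.
COMMENT = re.compile(r"#\(.*?#\)|#[^\n]*", re.DOTALL)


def strip_comments(source):
    """Remove Yo comments from source text (string literals are not handled)."""
    return COMMENT.sub("", source)
```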

3.2.2. Identifiers

An identifier in Yo is a case-sensitive string that differs from every reserved word (see section 3.2.3). It starts with a letter or an underscore, optionally followed by a series of characters (letters, underscores, digits). Its length ranges from 1 to 256.

Formally, an identifier is any non-reserved word that matches the following regular expression:

Identifier ::= [a-zA-Z_][a-zA-Z0-9_]{0,255}

Tip

_number _number1 number2 number_3 Number

Error

2num *num func $2 Int Double Bool

Note

func, Int, Double and Bool are illegal because they are reserved words. The full list can be found in section 3.2.3, Reserved Words.
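The identifier rule can be checked mechanically. The following Python sketch combines the regular expression above with the reserved words listed in section 3.2.3; the names IDENTIFIER, RESERVED and is_valid_identifier are hypothetical and serve only as an illustration.

```python
import re

# Identifier ::= [a-zA-Z_][a-zA-Z0-9_]{0,255}
IDENTIFIER = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]{0,255}")

# Reserved words copied from section 3.2.3.
RESERVED = {
    "break", "continue", "for", "while", "if", "else", "eval", "func",
    "global", "in", "struct", "return",   # keywords
    "Bool", "Int", "Double", "log",       # built-in types
    "true", "false",                      # constants
}


def is_valid_identifier(name):
    """Return True if name matches the pattern and is not reserved."""
    return IDENTIFIER.fullmatch(name) is not None and name not in RESERVED
```

With this check, every name in the Tip box is accepted and every name in the Error box is rejected.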

3.2.3. Reserved Words

This is a list of reserved words in Yo. Since they are used by the language, these words are not available for naming variables or functions. The reserved words consist of keywords, built-in type names and special constants.

  • Keywords: break, continue, for, while, if, else, eval, func, global, in, struct, return.
  • Built-in types: Bool, Int, Double, log.
  • Constants: true, false.

3.2.4. Operators

An operator is a special token that performs an operation, such as addition or subtraction, on one, two or three operands. Operators are covered in full in a later chapter; see Expression and Operators.

3.2.5. Separators

A separator separates tokens. Whitespace (see section 3.2.7) is a separator, but it is not a token. The other separators are all single-character tokens themselves: ( ), [ ], ,.

3.2.6. New Line

A physical line ends with an explicit \n input from the user, while a logical line contains a complete statement. A logical line can consist of multiple physical lines, all except the last one ending with an explicit \.

line 1 \
        line 1 continued \
line 1 last line
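The joining of physical lines into logical lines can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the actual Yo lexer: the function name logical_lines is hypothetical, and leading whitespace on a continuing physical line is dropped, as described in section 3.2.7.

```python
def logical_lines(source):
    """Join physical lines that end in a backslash into logical lines."""
    result = []
    parts = []  # pieces of the logical line currently being assembled
    for physical in source.split("\n"):
        if parts:
            # On a continuing physical line, leading whitespace is ignored.
            physical = physical.lstrip(" \t")
        if physical.endswith("\\"):
            parts.append(physical[:-1])  # drop the trailing backslash
        else:
            parts.append(physical)
            result.append("".join(parts))
            parts = []
    if parts:
        # The source ended in the middle of a continuation.
        result.append("".join(parts))
    return result
```

Applied to the three physical lines above, the sketch yields a single logical line.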

3.2.7. Whitespace

Whitespace characters such as tab and space are generally used to separate tokens. However, Yo is not a free-format language: in some cases, the position and number of whitespace characters matter to how the code is interpreted. Leading tabs denote code blocks and determine the code hierarchy (similar to curly brackets in C-family languages). Briefly, each extra leading tab moves the line one level lower in the code hierarchy.

In contrast to Python, Yo only accepts tabs \t for leading indentation; spaces are not allowed. In other words, a space must not appear at the beginning of any line (except for a continuing physical line, where all leading whitespace is ignored).

im_a_parent
        im_a_child
                im_a_grandchild
        im_another_child
                im_a_grandchild

Usually, for, while, if, else and function definitions may start a new code block. A code block ends with an un-indent. In the above example, im_a_child and im_another_child are at the same indentation level.
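The indentation rule can be made concrete with a short Python sketch that computes the hierarchy level of a line from its leading tabs and rejects leading spaces. The function name indent_level is hypothetical, and continuing physical lines are assumed to have been joined already.

```python
def indent_level(line):
    """Return the number of leading tabs; reject leading spaces."""
    stripped = line.lstrip("\t")
    if stripped.startswith(" "):
        raise ValueError("spaces are not allowed in leading indentation")
    return len(line) - len(stripped)
```

In the example above, im_a_parent is at level 0, both children are at level 1, and each grandchild is at level 2.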

3.3. Types

Yo is a statically and strongly typed programming language, which means the type of each variable, expression or function is determined at compile time and remains unchanged throughout the program.

Yo has an object-oriented model in which every value is an object and every operation is a method call. The object model is pure and uniform in the sense that the traditional primitive values (integers, double-precision floating-point numbers) and functions are incorporated into it.

A type in Yo is a blueprint for objects, resembling the concept of a class in other languages such as C++, Java and Python. The built-in types can be used as building blocks for user-defined types.

Note

There are three kinds of types: value types, function types and the None type. For the sake of definition, functions are mentioned in this section, but their details are covered in later sections.

In this section, we first list some built-in types as an introduction to our type system. Then we give the formal definition of type and show how users define types in their program.

3.3.1. Built-in Types

Below we list the built-in types in Yo. Because they are the building blocks of a program, Yo provides literals to initialize them conveniently in source code. The operators on these types are covered in the next section.

  • Int 32-bit signed integral number, ranging from \(-2^{31}\) to \(2^{31} - 1\). The literal must be written in decimal:

    IntLiteral ::= [0-9]+

    Note

    A compile error is generated if the Int literal exceeds the range defined above.

    Yo does not support a leading positive/negative sign (because in most cases, negative numbers would not be used). Users can still create negative numbers by subtracting from zero, for example 0-5, which is just -5.

  • Double 64-bit double-precision floating-point number. The literal is represented as follows:

    DoubleLiteral ::= [0-9]*.[0-9]+

    Note

    The dot and the fractional part are compulsory (otherwise the literal would be identified as an Int). For example, 32.45 and .5 are valid Double literals.

  • Bool Binary value of either true or false

    BoolLiteral ::= true | false

  • String A contiguous set of characters. The literal has zero or more characters enclosed in double quotes. A character can be a regular character or an escape sequence.

    StringLiteral ::= "StringCharacter?"

    StringCharacter ::= [^"\'] StringCharacter
                      | [^"\']
                      | EscapeSequence StringCharacter
                      | EscapeSequence

    EscapeSequence ::= \b | \t | \n | \r | \" | \' | \\

    Escape sequences are listed in the table below.

    Escape Sequence    Meaning
    \b                 Backspace
    \t                 Horizontal Tab
    \n                 New Line
    \r                 Carriage Return
    \"                 Double Quote
    \'                 Single Quote
    \\                 Backslash

    Examples of valid String: "abc", "9j32 f0kca0", "Hello\nYo!"
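The literal rules listed above can be collected into a small classifier. The following Python sketch is only an illustration, not the Yo lexer: the names classify_literal, INT_MAX and the three pattern constants are hypothetical. It assumes that the dot in DoubleLiteral is a literal dot, and that an Int literal is checked against the upper bound \(2^{31} - 1\) because literals carry no sign.

```python
import re

INT_LITERAL = re.compile(r"[0-9]+")
DOUBLE_LITERAL = re.compile(r"[0-9]*\.[0-9]+")
# A double quote, then any mix of ordinary characters (anything except
# ", \ and ') and the listed escape sequences, then a closing double quote.
STRING_LITERAL = re.compile(r'''"(?:[^"\\']|\\[btnr"'\\])*"''')

INT_MAX = 2**31 - 1  # assumed limit for an unsigned-looking Int literal


def classify_literal(text):
    """Return 'Int', 'Double', 'Bool' or 'String'; raise ValueError otherwise."""
    if INT_LITERAL.fullmatch(text):
        if int(text) > INT_MAX:
            raise ValueError("Int literal out of range: " + text)
        return "Int"
    if DOUBLE_LITERAL.fullmatch(text):
        return "Double"
    if text in ("true", "false"):
        return "Bool"
    if STRING_LITERAL.fullmatch(text):
        return "String"
    raise ValueError("not a literal: " + text)
```

Trying Int before Double reflects the note that a literal without a dot and a fractional part is identified as an Int.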