S T R U C T O R I Z E R - User Guide
Syntax > Basic Concepts

Recursive "Definition"

An essential concept is that of an expression. An expression describes a (not necessarily numeric) value or the operations to compute a value from other values by means of functions and operators in a more or less natural way. Obvious examples of expressions are:

4 + 7
r * sin(x)
sqr(a + b)

Below is a semi-formal, recursive introduction to the syntax. Where helpful, it uses a kind of extended Backus-Naur form as a well-known description format (though not applied with full rigour here). Metasymbols are:

  • angle brackets <, > enclose a non-terminal concept defined by grammar rules,
  • definition symbol ::= separates the concept to be defined from the replacing symbols,
  • vertical bars | separate alternative rules within a combined rule,
  • parentheses (, ) group e.g. alternative syntactic parts within a rule,
  • brackets [, ] enclose optional parts,
  • {, } enclose iterated parts within a rule.

Where the above characters are not meant to be meta-symbols but terminal symbols (actual characters in the defined language) they will be underlined to mark the difference. 

Atomic expressions are literals and identifiers.

Literals

  • The keywords true and false are Boolean literals.
    <literal_bool> ::= true | false
  • A sequence of digits possibly preceded by a sign is an integer literal:
    <digit> ::= 0|1|2|3|4|5|6|7|8|9
    <literal_int> ::= [+|-] <digit> {<digit>}
  • A sequence of digits and letters A...F or a...f after a 0x prefix is a hexadecimal integer literal:
    <hex_digit> ::= <digit>|A|B|C|D|E|F|a|b|c|d|e|f
    <literal_hex> ::= 0x<hex_digit> {<hex_digit>}
  • An integer literal followed by an 'L' character is a long integer literal.
    <literal_long> ::= <literal_int>L
  • A sequence of digits with a decimal point or an exponential postfix is a floating-point literal; the keyword Infinity and the symbol ∞ are also floating-point literals (since version 3.30-15):
    <literal_float> ::= <literal_int> . <digit>{<digit>} [ E [+|-] <digit>{<digit>} ]
         | [+|-] . <digit> {<digit>}[ E [+|-] <digit>{<digit>} ]
         | Infinity | ∞
  • A single printable character (except a single quote) enclosed in apostrophes (i.e., single quotes) is regarded as character literal:
    <literal_char> ::= '<character>'
    However, certain escape notations such as '\n', '\t', '\0', '\'', and '\u0123' are also valid character literals.
  • Other character sequences enclosed in single or double quotes are string literals (unless they form a character literal). A string literal may also contain certain escape sequences. In particular, the enclosing delimiter (a single or double quote, respectively) must be escaped with a preceding backslash if it occurs within the string literal content.
    <literal_string> ::= " {<character>} "  |  ' {<character>} '
  • Integer, hexadecimal, long, and floating-point literals may together be qualified as numerical literals:
    <literal_num> ::= <literal_int> | <literal_hex> | <literal_long> | <literal_float>

Examples:

  • true is a Boolean literal, meaning the logical value TRUE.
  • -12 is an integer literal, meaning the obvious value of minus twelve.
  • 12.97 and -6.087e4 are non-integral (floating-point) numeric literals.
  • '9' is not a numeric but a character literal.
  • "Attention!" and 'more than 1 character' are string literals.
  • "He called me \"moron\" when I left." and 'is"ok' are valid string literals, "oh"no" is not.
  • 'a' is a character literal whereas "a" is a string literal.
  • ∞ is a floating-point (double) literal, meaning an infinite positive value (same as Infinity).
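
The literal rules above can be tried out with a few regular expressions. The following is only a minimal sketch in Python, not Structorizer's actual lexer: the names LITERAL_PATTERNS and classify_literal are hypothetical, the accepted escape notations are limited to the ones listed above, and a lowercase exponent marker is accepted as an assumption because the example -6.087e4 uses one.

```python
import re

# Optional exponent part; lowercase 'e' is accepted as an assumption.
EXPONENT = r"([Ee][+-]?[0-9]+)?"

# Hypothetical patterns transcribed from the semi-formal rules above.
# Order matters: a character literal is tested before a string literal.
LITERAL_PATTERNS = [
    ("bool", r"true|false"),
    ("hex", r"0x[0-9A-Fa-f]+"),
    ("long", r"[+-]?[0-9]+L"),
    ("int", r"[+-]?[0-9]+"),
    ("float", r"[+-]?[0-9]*\.[0-9]+" + EXPONENT + r"|Infinity|∞"),
    ("char", r"'([^'\\]|\\[nt0']|\\u[0-9A-Fa-f]{4})'"),
    ("string", r'"([^"\\]|\\.)*"' + r"|'([^'\\]|\\.)*'"),
]


def classify_literal(text):
    """Return the kind of literal that `text` forms, or None if it is no literal."""
    for kind, pattern in LITERAL_PATTERNS:
        if re.fullmatch(pattern, text):
            return kind
    return None


for sample in ["true", "-12", "12.97", "'9'", '"Attention!"', '"oh"no"']:
    print(sample, "->", classify_literal(sample))
```

Testing the character pattern before the string pattern reflects the rule that a quoted sequence is a string literal only if it is not a character literal.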

Identifiers

An identifier is a name for certain concepts. In contrast to literals, identifiers require some user-specific declaration or introduction that associates them with a storage place, a value, or, e.g., a type.

  • A contiguous sequence of ASCII letters, digits and underscores, ideally beginning with a letter, but at least not beginning with a digit, is an identifier:
    <letter> ::= A|B|C|...|Z|a|b|c|...|z
    <identifier> ::= (_|<letter>){_|<letter>|<digit>}

Examples:

  • kill_bill is an identifier, off the record is not (there must not be spaces within).
  • Infinity, true, and false are not regarded as identifiers, because they are reserved as literal keywords.
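
As a minimal sketch, the identifier rule and the reserved literal keywords can be checked as follows in Python. The names is_identifier and RESERVED_LITERAL_KEYWORDS are hypothetical and only serve this illustration.

```python
import re

# Keywords that are reserved as literals and therefore are not identifiers.
RESERVED_LITERAL_KEYWORDS = {"true", "false", "Infinity"}

# <identifier> ::= (_|<letter>){_|<letter>|<digit>}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_identifier(text):
    """Check `text` against the identifier rule, excluding literal keywords."""
    if text in RESERVED_LITERAL_KEYWORDS:
        return False
    return IDENTIFIER.fullmatch(text) is not None


print(is_identifier("kill_bill"))       # an identifier
print(is_identifier("off the record"))  # contains spaces, so not an identifier
```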

Expressions

  • Literals (see above) are (atomic) expressions.
  • Variable designators are (atomic) expressions specifying a storage location associated with a variable or some structural part of it, such as an element of an array variable or a component of a compound (record/struct) variable. Since data structures may be nested, a variable designator may be a long sequence starting with an identifier (the variable name), followed by any number of accessors (bracket-enclosed index lists or dot-linked component selectors). Semantically, a variable designator is only valid if the sequence of accessors corresponds to the nested structure of the variable, specified e.g. by a variable declaration (usually related to type definitions), initialisation, or assignment.
    <var_desig> ::= <identifier> | <var_desig> <accessor>
    <accessor> ::= . <identifier> | [ <int_expr_list> ]
    <int_expr_list> ::= <int_expr> { , <int_expr> }
    An <int_expr> is just an <expression> the value of which has to be an integer.
    Examples:
        person
        today.month
        readings[k]
        staff[i+5].date_of_birth.year
        chess.board[row, column]
        matrix[i][j][k]
  • A function call is an atomic expression. It is formed by an identifier followed by a pair of parentheses, which enclose a (possibly empty) comma-separated list of expressions. Note: Though procedure calls look quite the same (see below), functions return a value when called, whereas procedures don't. Hence, procedure calls are not expressions but statements (see below).
    You find a list of provided built-in functions and procedures in the User Guide.
    <func_call> ::= <identifier> ( <expression_list> )
    <expression_list> ::= [ <expression> { , <expression> } ]
    Examples:
       sin(alpha)
       copy("comparsery", 5, 4)
  • Expressions joined by suited operator symbols are expressions. You find a table of accepted operator symbols in the Structorizer User Guide. The following recursive rules reflect operator precedence.
    <factor> ::= [ + | - ] <atomic_expression>
    <not_expr> ::= (not | !) <atomic_expression>
    <mult_expr> ::= <factor> | <mult_expr> ( * | / | div | mod | % ) <factor>
    <add_expr> ::= <mult_expr> | <add_expr> ( + | - ) <mult_expr>
    <log_expr> ::= <add_expr> ( = | == | <> | < | > | <= | >= ) <add_expr> | <not_expr>
    <log_and_expr> ::= <log_expr> | <log_and_expr> ( and | && ) <log_expr>
    <log_expression> ::= <log_and_expr> | <log_expression> ( or | || | xor ) <log_expression>
    <cond_expr> ::= <log_expression> ? <expression> : <cond_expr>
  • An expression enclosed in parentheses is an atomic expression.
    Examples:
        (23.5)
        (a[i])
        (23 * (17 + length("some text")))
  • A comma-separated list of expressions as described above, enclosed in braces, is an array-initializing expression (only usable in assignments, as routine arguments, as input, and in FOR-IN loops). The list may be empty.
    <init_expr_a> ::= { <expression_list> }
    Examples:
        {2, 3, 5, 7, 11, 13, 17}
        {}
        {"Fruit", "flies", "like", "banana"}
        {17.3*4, sqrt(2.9), pow(1.2, k), log(val)}
  • A defined record type identifier, followed by a pair of braces that include comma-separated triples of a declared component identifier, a colon, and an expression is a record-initializing expression.
    <init_expr_r> ::= <identifier> { <comp_init_list> }
    <comp_init_list> ::= [ <comp_init> { , <comp_init> } ]
    <comp_init> ::= <identifier> : <expression>
    Examples:
        Date{2023, 10, 14}
        Employee{"Dough", "John", Date{1995, 12, 24}, HEAD_OF_DPT}
        UnitValue{130, KMPH}
    • Since version 3.28-06, a smart record-initializing expression is supported. It still starts with a defined record type identifier, but the following pair of braces may contain a mere list of expressions as in the array-initializing expression. In this case the values will be assigned to the components in order of their declaration in the respective record type definition. There must not be more expressions in the list than components in the type (but there might be less). It is also allowed that an incomplete expression list is followed by a sequence of comma-separated triples of name, colon, and expression as before. Simple expressions following a component initialization triple, however, are ignored (examples see Records).
      <comp_init> ::= [ <identifier> : ] <expression>
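
The assignment order described for smart record initializers can be sketched in Python as follows. This is only an illustration of the rules stated above, not Structorizer's implementation. The helper names Named and init_record as well as the component names year, month, and day are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Named:
    """A component initialization triple: name, colon, expression value."""
    name: str
    value: Any


def init_record(component_names, items):
    """Assign initializer items to record components.

    Plain values are assigned in declaration order. Once a Named item
    has occurred, further plain values are ignored. Components without
    an initializer simply stay absent from the result.
    """
    record = {}
    seen_named = False
    position = 0
    for item in items:
        if isinstance(item, Named):
            if item.name not in component_names:
                raise ValueError(f"unknown component: {item.name}")
            record[item.name] = item.value
            seen_named = True
        elif seen_named:
            continue  # simple expression after a named triple: ignored
        else:
            if position >= len(component_names):
                raise ValueError("more expressions than components")
            record[component_names[position]] = item
            position += 1
    return record


date_components = ["year", "month", "day"]  # hypothetical declaration order
print(init_record(date_components, [2023, 10, 14]))
print(init_record(date_components, [2023, Named("day", 14), 99]))
```

In the second call, 2023 is assigned positionally to the first component, the named triple sets day, and the trailing 99 is ignored because it follows a named triple.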

Summary:

<atomic_expression> ::= <literal> | <var_desig> | <func_call> | ( <expression> )

<expression> ::= <add_expr> | <log_expression> | <init_expr_a> | <init_expr_r> | <cond_expr>
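
To see how the recursive rules encode operator precedence, here is a minimal recursive-descent sketch in Python covering only <factor>, <mult_expr>, and <add_expr>. It makes several simplifying assumptions: atomic expressions are restricted to unsigned numeric literals and parenthesised expressions, the left-recursive rules are rewritten as loops, and the result is a nested tuple rather than a computed value. The names tokenize, Parser, and parse are hypothetical.

```python
import re

# Numbers, the word operators div and mod, and single-character symbols.
TOKEN = re.compile(r"\s*(\d+\.\d+|\d+|div\b|mod\b|[-+*/%()])")


def tokenize(text):
    """Split the expression text into a list of token strings."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def atomic(self):
        # Simplified: a numeric literal or a parenthesised expression.
        token = self.take()
        if token == "(":
            node = self.add_expr()
            if self.take() != ")":
                raise SyntaxError("missing ')'")
            return node
        if token is not None and token[0].isdigit():
            return float(token) if "." in token else int(token)
        raise SyntaxError(f"unexpected token: {token!r}")

    def factor(self):
        # <factor> ::= [ + | - ] <atomic_expression>
        if self.peek() in ("+", "-"):
            sign = self.take()
            return (sign, self.atomic())
        return self.atomic()

    def mult_expr(self):
        # <mult_expr> ::= <factor> | <mult_expr> ( * | / | div | mod | % ) <factor>
        node = self.factor()
        while self.peek() in ("*", "/", "div", "mod", "%"):
            op = self.take()
            node = (op, node, self.factor())
        return node

    def add_expr(self):
        # <add_expr> ::= <mult_expr> | <add_expr> ( + | - ) <mult_expr>
        node = self.mult_expr()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.mult_expr())
        return node


def parse(text):
    """Parse an arithmetic expression into a nested tuple tree."""
    parser = Parser(tokenize(text))
    tree = parser.add_expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token: {parser.peek()!r}")
    return tree


print(parse("4 + 7 * 2"))
print(parse("(4 + 7) * 2"))
```

Because <add_expr> is built from <mult_expr> operands, multiplicative operators bind more tightly than additive ones, and the loops make operators of equal rank associate to the left.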

  • There are no other expressions.
  • The type of an expression is derived from the operators, operand expressions, and functions used. (The inexact BNF snippets above are merely meant to give a rough idea of how type deduction might work in a grammar-defined way. Providing a reasonably exact, parsable grammar would require much more non-terminal vocabulary and hundreds of BNF rules, with undeclared variables remaining the major weak point.)
  • A Boolean expression may be constructed with comparison operators or consist of operands with Boolean value etc.
  • On execution, the syntax is context-sensitive, i.e. the actual variable and constant types decide whether the expression is well-formed and can be evaluated. But then its result type is unambiguous.
    Consider the following diagram. It looks simple and straightforward: if 5 and 7 are entered, the result will be 12. But what if the user enters an array or record initializer? Then the yellow expression would be completely illegal. If one of the inputs is a string, then variable c will be a string. Otherwise, if a or b is false or true, the expression is illegal again. With two numeric values, c becomes a floating-point number if a or b was entered with a decimal point, and otherwise possibly an integer. And so on.
    [Figure: Simple diagram with impossible type prediction]
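
The case distinction described for the diagram can be mimicked in Python. This is a sketch under the assumption that the highlighted expression is a + b with both operands coming from user input. The function name add_values is hypothetical, Python lists and dicts stand in for array and record values, and the string rendering of non-string operands is Python's own, not Structorizer's.

```python
def add_values(a, b):
    """Mimic how the result type of a + b depends on the runtime operand types."""
    for value in (a, b):
        if isinstance(value, (list, dict)):
            raise TypeError("array or record operands are illegal here")
    if isinstance(a, str) or isinstance(b, str):
        # One string operand turns the whole result into a string.
        return str(a) + str(b)
    for value in (a, b):
        if isinstance(value, bool):
            raise TypeError("Boolean operands are illegal here")
    # Two numeric operands: float if either one is a float, else int.
    return a + b


print(add_values(5, 7))
print(add_values(5.0, 7))
print(add_values("5", 7))
```

The same expression thus yields an integer, a floating-point number, a string, or an error, which is why the result type cannot be predicted from the diagram text alone.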

Statements

Statements describe some executable action. In many traditional programming languages, statements are not a kind of expression, nor are they in Structorizer. They may contain and use expressions. Elements of Nassi-Shneiderman diagrams represent statements, not expressions. They may be simple (atomic: Instruction, Call, or Jump elements) or structured (i.e., they contain nested elements, any remaining kind of element).
  • An assignment is a statement (see Instruction in the user guide).
    <assignment> ::= <var_desig> ( <- | := ) <expression>
    In some programming languages (like C), assignments are expressions themselves and may thus be used as terms in more complex expressions. This does not hold in Structorizer, though.
  • A procedure call is a statement (instruction). Depending on whether the referenced procedure is a built-in one or refers to a user-defined subroutine diagram, either an Instruction element or a Call element is required to place the procedure call.
    <proc_call> ::= <identifier> ( <expression_list> )
  • Further statements are:
    • input statement
      <input_statement> ::= <input> [ <literal_string> [,] ] [ <var_desig> { , <var_desig> } ]
    • output statement
      <output_statement> ::= <output> <expression_list>
  • In a wide interpretation (e.g. C etc.), type definitions, constant definitions, and variable declarations might also be regarded as statements. In a stricter sense (e.g. Pascal), they are not. Structorizer places them in Instruction elements, so they might be subsumed under the concept "statement" here.
    • constant definition
      <const_definition> ::= const <identifier> ( <- | := ) <const_expression>
      <const_expression> is just an expression the value of which only depends on literals and defined constants.
    • type definition
      <type_definition> ::= type <identifier> = ( <record_spec> | <enum_spec> | <array_spec> | <identifier> )
      <record_spec> ::= ( record | struct ) { <comp_decl> { ; <comp_decl> } }
      <array_spec> ::= array <dim_ranges> of (<array_spec> | <identifier>) | <identifier> <dim_sizes>
      <enum_spec> ::= enum { <enum_item> { , <enum_item> } }
      <comp_decl> ::= <identifier> { , <identifier> } : ( <array_spec> | <identifier> )
      <enum_item> ::= <identifier> [ = <const_expression> ]
      <dim_ranges> ::=  | [ <dim_range> { , <dim_range> } ]
      <dim_range> ::= <literal_int> .. <literal_int> | <const_expression>
      <dim_sizes> ::= [ <const_expression> { , <const_expression> } ]
    • variable declaration may be a mere declaration or an initialised declaration. In the latter case (and only in the latter case) either Pascal-/BASIC-like or C-like style is supported. For a mere declaration (i.e. without initial value assignment), only Pascal/BASIC style is available.
      <variable_declaration> ::= <var_decl> | <var_init1> | <var_init_c>
      <var_decl> ::= ( var | dim ) <identifier> { , <identifier> } ( : | as ) (<array_spec> | <identifier>)
    • initialised variable declaration
      <var_init1> ::= ( var | dim ) <identifier> ( : | as ) (<array_spec> | <identifier>) ( <- | := ) <expression>
      <var_init_c> ::= <identifier> <identifier> [ <dim_sizes> ] ( <- | := ) <expression>
  • Return, leave, exit, and throw statements are Jump statements, represented by a specific type of element in structograms.
    <return_stmt> ::= return [ <expression> ]
    <leave_stmt> ::= leave [ <literal_int> ]
    <exit_stmt> ::= exit <add_expr>
    <throw_stmt> ::= throw [ <add_expr> ]

    <jump_statement> ::= <return_stmt> | <leave_stmt> | <exit_stmt> | <throw_stmt> 
    <statement> ::= <assignment> | <proc_call> | <input_statement> | <output_statement> | <jump_statement> | <const_definition> | <type_definition> | <variable_declaration>
  • Any composed statement (i.e., a basic algorithmic structure like an alternative or a loop) is represented by a specific kind of structogram element and therefore doesn't need a syntax explanation, with the exception of FOR loops.
    <for_loop_header> ::= <for> <identifier> ( <- | := ) <add_expr> <to> <add_expr> [ <by> <literal_int> ]
    <for_loop_header> ::= <foreach> <identifier> <in> <list_expr>
    <list_expr> ::= <qual_name> | <init_expr_a> | <expression_list>
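
As an illustration of the counting FOR loop header rule, the following Python sketch splits a header into its parts. It assumes that the keyword placeholders <for>, <to>, and <by> stand for the literal words for, to, and by, which is an assumption made for this example only. The name parse_for_header is hypothetical, and the start and end expressions are returned as unparsed text.

```python
import re

# Assumed keywords: "for", "to", "by" (stand-ins for <for>, <to>, <by>).
FOR_HEADER = re.compile(
    r"for\s+(?P<var>[A-Za-z_][A-Za-z0-9_]*)\s*(?:<-|:=)\s*"
    r"(?P<start>.+?)\s+to\s+(?P<end>.+?)"
    r"(?:\s+by\s+(?P<step>[+-]?\d+))?\s*"
)


def parse_for_header(text):
    """Return the parts of a counting FOR header, or None if it does not match."""
    match = FOR_HEADER.fullmatch(text)
    if match is None:
        return None
    step = match.group("step")
    return {
        "var": match.group("var"),
        "start": match.group("start"),
        "end": match.group("end"),
        # The step is an integer literal; None means no <by> clause was given.
        "step": int(step) if step is not None else None,
    }


print(parse_for_header("for i <- 1 to n - 1 by 2"))
print(parse_for_header("for k := 0 to 9"))
```

A header of the FOR-IN kind does not contain an assignment symbol and is therefore rejected by this pattern.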