S T R U C T O R I Z E R - User Guide
Syntax > Specific aspects for ARM export

About the ARM Generator prototype

Under construction

Since version 3.32-02, Structorizer has provided a (somewhat premature) prototype generator for ARM assembler code, thanks to Alessandro Simonetta et al.

ARM (assembler) code is a mnemonic representation of machine code for ARM processors, so its abstraction level is far below that of higher-level programming languages like Pascal, C, or Java. Generating ARM code from an arbitrary Nassi-Shneiderman diagram would hence require a full compiler (one that even breaks floating-point arithmetic down into sequences of byte and word operations). This cannot be the task of Structorizer at this early stage. Instead, the aim is to convert algorithms that are already formulated at the level of RISC processor capabilities from a structogram into ARM assembler code.

Even conceding this, the conversion capabilities of this early prototype are still very limited: only a narrow range of statements and expressions can be translated. The restrictions are briefly explained below.

  • The set of supported statements is very limited and the syntax may even differ from the Structorizer conventions (see Basic Concepts).
  • Certain variable names will be interpreted directly as machine registers, and there are some additional keywords or markers for certain machine-oriented aspects.
  • Array definitions differ strongly from the usual conventions in Structorizer (see Arrays).
  • Records and enumerations are not supported at all so far.

Register mapping

Variable names R0, R1, ..., R15 and, equivalently(!), r0, r1, ..., r15 are interpreted as registers of the ARM processor architecture. Other variables will be mapped to registers that are not explicitly referenced. If more than 16 variables occur in a diagram, the generator will not be able to translate them sensibly (in future it is meant to manage them more or less intelligently in memory). If both the upper-case and the lower-case name of the same register (same number) occur in one diagram (e.g. R5 and r5), the behaviour is undefined.

<identifierR> thus denotes an identifier as described in Basic Categories where ARM register names are treated in a special way.

<register> denotes one of the register names R0, R1, ..., R15, or r0, r1, ..., r15.
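The naming rule above can be illustrated with a small Python sketch. It is not taken from the generator: the helper names register_number and conflicting_registers are hypothetical, and the sketch assumes that register numbers are written without leading zeros.

```python
import re

# Assumption: register names are exactly R0..R15 or r0..r15, no leading zeros.
_REGISTER = re.compile(r"[Rr](?:1[0-5]|[0-9])")

def register_number(name):
    """Return the register number (0..15) if name is a register name, else None."""
    if _REGISTER.fullmatch(name):
        return int(name[1:])
    return None

def conflicting_registers(names):
    """Return the numbers of registers spelled both in upper and in lower case."""
    spellings_by_number = {}
    conflicts = set()
    for name in names:
        number = register_number(name)
        if number is None:
            continue  # an ordinary variable, not a register
        spellings = spellings_by_number.setdefault(number, set())
        spellings.add(name[0])  # 'R' or 'r'
        if len(spellings) > 1:
            conflicts.add(number)
    return sorted(conflicts)
```

A check of this kind could be run over the variable names of a diagram before export, so that the undefined case (e.g. R5 together with r5) is noticed early.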

Expression complexity

The manageable complexity of expressions is very low at the moment. Usually, only "flat" expressions using a single kind of operator (e.g. addition or multiplication, but not both) can be processed. Complex nesting is not supported, and parentheses will be ignored.

Besides the usual assignment operators, the only supported operator symbols (referred to as <operator> below) are:

+, -, *, &, |, and, &&, or, ||

Logical expressions (to be used in Alternatives, While and Repeat loops) may either be atomic or a series of one or more comparisons combined either by and (equivalently: &&) or by or (equivalently: ||), but not both. Do not rely on operator precedence, because parentheses will be eliminated internally. Atomic logical expressions may be variables or registers (which are then implicitly tested to be non-zero); a negation operator (not or !) may be applied to them.


  • isNice
  • not R5
  • R4 < 17
  • R0 = 'b' or R1 >= R4 or R6 = 0x2e4
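To illustrate the rule that comparisons may be combined by and or by or, but not both, the following Python sketch classifies a condition by its connective. It is only an illustration, not the generator's parser: the function name connective_kind is made up here, the keywords are assumed to be lower-case, and quoted literals are not treated specially.

```python
import re

_AND = {"and", "&&"}
_OR = {"or", "||"}

def connective_kind(condition):
    """Return 'and', 'or', or None (no connective); raise ValueError on a mix."""
    # Split into &&, ||, word-like tokens, and single other characters.
    tokens = re.findall(r"&&|\|\||\w+|\S", condition)
    has_and = any(token in _AND for token in tokens)
    has_or = any(token in _OR for token in tokens)
    if has_and and has_or:
        raise ValueError("'and' and 'or' must not be mixed in one condition")
    if has_and:
        return "and"
    if has_or:
        return "or"
    return None
```

For the last example above, connective_kind would report "or"; a condition such as a < 1 and b > 2 or c = 3 would be rejected instead of being translated with an unintended precedence.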

To keep things simple, we introduce a combined literal concept <int_literal> here, which is either a decimal <literal_int> or a hexadecimal <literal_hex> (see Basic Concepts):

<int_literal> ::= <literal_int> | <literal_hex>
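As a minimal sketch of this combined literal, the Python function below accepts either form and returns the value. It assumes that hexadecimal literals carry a 0x prefix, as in the examples on this page (0x2e4, 0x12), and that literals are unsigned; the exact definitions are those in Basic Concepts, and the name parse_int_literal is hypothetical.

```python
import re

# Assumption: <literal_hex> is 0x followed by hex digits, <literal_int> is plain digits.
_INT_LITERAL = re.compile(r"0[xX][0-9a-fA-F]+|[0-9]+")

def parse_int_literal(text):
    """Return the integer value of a decimal or hexadecimal literal."""
    if not _INT_LITERAL.fullmatch(text):
        raise ValueError(f"not an integer literal: {text!r}")
    if text[:2].lower() == "0x":
        return int(text[2:], 16)
    return int(text, 10)
```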


Basic assignment

The basic assignments allow just Boolean literals, integral literals, variables or a single operation between two simple terms.

<identifierR> ( <- | := ) (true | false)

<identifierR> ( <- | := ) ( <identifierR> | <int_literal> ) [ <operator> ( <identifierR> | <int_literal> ) ]


  • test ← false
  • count ← R3
  • R4 ← 0x6 + count
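The two assignment forms can be approximated by a single regular expression, sketched here in Python. This is an illustration under assumptions, not the generator's own code: the name split_basic_assignment is hypothetical, identifiers are assumed to consist of ASCII letters, digits and underscores, and ← is accepted as the displayed form of <-. The sketch treats true and false like ordinary identifiers, so it is slightly more permissive than the grammar.

```python
import re

_TERM = r"(?:[A-Za-z_]\w*|0[xX][0-9a-fA-F]+|[0-9]+)"
_OP = r"(?:\+|-|\*|&&|\|\||&|\||\band\b|\bor\b)"
_ASSIGN = r"(?:<-|:=|←)"
_BASIC = re.compile(
    rf"\s*([A-Za-z_]\w*)\s*{_ASSIGN}\s*({_TERM})(?:\s*({_OP})\s*({_TERM}))?\s*"
)

def split_basic_assignment(line):
    """Return (target, left, operator, right) or None if the line does not fit.

    operator and right are None when the right-hand side is a single term.
    """
    match = _BASIC.fullmatch(line)
    if match is None:
        return None
    return match.groups()
```

A line with more than one operator, such as R1 <- a + b * c, yields None, which mirrors the restriction to a single operation between two simple terms.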

Memory read and memory write operations

<identifierR> ( <- | := ) (memory | memoria) '[' <identifierR> [ + <int_literal> ] ']'

(memory | memoria) '[' <identifierR> [ + <int_literal> ] ']' ( <- | := ) <identifierR>


  • R6 ← memoria[height]
  • R2 ← memory[R3 + 0x12]

  • memory[R3] ← R8
  • memoria[count + 4] ← r2
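The memory operand used on either side of these assignments has the shape of a base plus an optional constant offset. The Python sketch below extracts both parts; it is an illustration only, the name parse_memory_operand is hypothetical, and hexadecimal offsets are assumed to use a 0x prefix.

```python
import re

_MEMORY = re.compile(
    r"(?:memory|memoria)\s*\[\s*([A-Za-z_]\w*)\s*"
    r"(?:\+\s*(0[xX][0-9a-fA-F]+|[0-9]+)\s*)?\]"
)

def parse_memory_operand(text):
    """Return (base, offset) for memory[base] or memory[base + offset], else None."""
    match = _MEMORY.fullmatch(text.strip())
    if match is None:
        return None
    base, offset = match.groups()
    if offset is None:
        return base, 0
    # int() accepts the 0x prefix when base 16 is given explicitly.
    return base, int(offset, 16 if offset.lower().startswith("0x") else 10)
```

Note that an offset given as a variable or register (e.g. memory[R3 + R4]) does not fit the grammar above and is rejected by the sketch as well.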

Address assignment

Assigns the address of some variable held in storage to a register. The right-hand side of the assignment resembles the call of a built-in function.

<register> ← (address | indirizzo) '(' <identifierR> ')'


  • R5 ← address(storage)
  • R2 ← indirizzo(count)

Character assignment and String initialization

Assigns a character or string literal (as content of which only letters, digits, and underscores are allowed):

<identifierR> <- ' (_|<letter>|<digit>) '

<identifierR> <- " (_|<letter>|<digit>){_|<letter>|<digit>} "

If the literal contains only a single character between the delimiters (quotes), then the assignment is deemed to be a character initialization rather than a string initialization.

Note: So far, the assignment of an empty string literal is not supported.


  • R4 ← 'r'
  • number ← "3"
  • R9 ← "these_are_4_silly_words"
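The distinction between character and string initialization can be sketched in Python as follows. The sketch reads the rule above as depending only on the number of characters between the quotes, regardless of the kind of quote, which is an interpretation rather than confirmed generator behaviour. The name classify_literal is hypothetical, and letters are assumed to be ASCII letters.

```python
import re

# Assumption: only ASCII letters, digits and underscores count as content.
_SINGLE_QUOTED = re.compile(r"'([A-Za-z0-9_])'")
_DOUBLE_QUOTED = re.compile(r'"([A-Za-z0-9_]+)"')

def classify_literal(text):
    """Return 'char', 'string', or None for an unsupported literal."""
    match = _SINGLE_QUOTED.fullmatch(text) or _DOUBLE_QUOTED.fullmatch(text)
    if match is None:
        return None  # empty literal or unsupported characters
    return "char" if len(match.group(1)) == 1 else "string"
```

Under this reading, the second example above ("3") counts as a character initialization, while the third is a string initialization; an empty literal or one containing blanks is reported as unsupported.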

Array support

Array support is still deficient, i.e. it does not work properly.

Arrays must first be initialized by a statement of the following form, which is not compatible with the usual Structorizer syntax (i.e. they are to be "declared" with a specific low-level data type, as if a scalar variable were being declared):

(word | hword | byte | octa | quad) <identifierR> ( <- | := ) '{' <int_literal> { , <int_literal> } '}'

Example: word array1 ← {56, 7, 98}

If <identifierR> designates a register, then the register will automatically be associated with the address of the array.

Then assignments of the subsequent kinds are meant to be accepted (though the produced code is still inconsistent):

Read from an array:

<identifierR> ( <- | := ) <identifierR> '[' (<identifierR> | <int_literal>) ']'

Example: c ← array1[R5]

Write to an array:

<identifierR> '[' (<identifierR> | <int_literal>) [ + (<identifierR> | <int_literal>) ] ']' ( <- | := ) <identifierR>

Example: array1[R2 + 7] ← R4