The Analyser Preferences Dialog
The Analyser Preferences dialog is opened by means of the Preferences menu:
The Analyser is an advanced feature that continuously checks the structogram against a set of rules that structograms should comply with and looks for obvious inconsistencies (such as loops whose body has no impact on the condition and which may therefore unintentionally run forever).
The Analyser rules available for configuration are presented in a multi-tab dialog. It roughly categorizes the rules into
- essential algorithmic tests;
- general syntax checks (since version 3.32-01);
- checks concerning identifier naming and code style conventions; and
- hints and tutoring (see Guided Tours / Tutoring):
Since version 3.30-14, elements related to one or more warnings in the Analyser report are by default marked with a small red triangle in the upper left corner (see example screenshots further below) in order to draw the user's attention to the warnings in the report list. This feature was introduced together with a new checkbox at the bottom of the Analyser Preferences dialog, labelled "Draw warning sign in affected elements" (see screenshot above). By unselecting this checkbox you may switch off this indication. (It will also be suppressed by disabling Analyser, of course.)
Among the convention rules there are also several that have been specially designed for Luxembourgish students. In Luxembourgish schools these rules are mandatory. The most specific ones of this kind are marked with "(LUX/MEN)", so you may switch them off if you don't have to obey these rules.
Actually, each rule can be enabled or disabled independently. The analyser itself can be activated or disabled as a whole via the "View" menu (see Settings > Analyse structogram?) or by pressing the <F3> key.
The analyser relies strongly on the Parser Preferences. If your syntax does not stick closely to them, the analyser will not work correctly and will probably produce a lot of needless warnings.
Note: As it is proven that a program can never absolutely predict the behaviour of another program, the messages produced by the analyser should at best be considered as hints. They might be misleading or even wrong in certain cases!
The fourth tab is dedicated to some smoothly guided tours. Two prototypes of them are available since version 3.27-02. Further ones are likely to be added:
Rule Type Explanation
(Checkbox order may change, this list follows the one presented by version 3.25-07, see screenshots above.)
Instructions
- Check for non-initialized variables.
See Analyser for examples. This check reports all names that occur in expressions, adhere to strict identifier syntax (see "Check for valid identifiers" below), and have not been initialized in a preceding instruction. This covers names for which there is no initialisation at all as well as variables used prior to their first assignment. Variables initialized only in some branches of an alternative or CASE statement are also reported as potentially uninitialized.
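The branch rule can be illustrated with a minimal Python sketch. It is not Structorizer's own code; the function name and the representation of branches as sets of variable names are assumptions made for illustration. A variable counts as definitely initialized after an alternative only if every branch initializes it.

```python
def initialised_after_alternative(before: set[str],
                                  branch_inits: list[set[str]]) -> tuple[set[str], set[str]]:
    """Return (definitely initialized, only possibly initialized) names.

    before       -- names initialized ahead of the alternative
    branch_inits -- one set per branch with the names assigned in it
                    (hypothetical input format; an empty branch is an empty set)
    """
    # Only names assigned in every branch are safe afterwards.
    definitely = before | set.intersection(*branch_inits)
    # Names assigned in some branch only would be reported as potentially uninitialized.
    possibly = set.union(*branch_inits) - definitely
    return definitely, possibly
```

For example, with `n` initialized beforehand, `a` assigned in both branches and `b` in only one, `b` ends up in the second set.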
- Check for assignment errors.
This analysis detects instruction lines containing an equality comparison operator but no assignment operator. Since many languages (including C, Java, PHP, etc.) use the single equality sign '=' as assignment operator, it is easy to write it here by mistake instead of := or <-, so the check will warn in such cases. The check therefore doesn't make sense if you deliberately fill in source code complying with the syntax of such a programming language (see No conversion of the expression/instruction contents in Export Options, by the way) and aren't interested in language-independent executability.
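A rough idea of such a test is given by the following Python sketch. It is a simplification under stated assumptions, not the Analyser's actual logic: the helper name is hypothetical, only a lone '=' is looked for, and string literals are not skipped.

```python
import re

# Assignment operators mentioned in the text above.
ASSIGN_OPS = (":=", "<-")
# A lone '=' that is not part of ':=', '==', '<=', '>=' or '!='.
LONE_EQUALS = re.compile(r"(?<![:=<>!])=(?!=)")

def looks_like_assignment_error(line: str) -> bool:
    """Return True if the line has a bare '=' but no ':=' or '<-'."""
    if any(op in line for op in ASSIGN_OPS):
        return False
    return LONE_EQUALS.search(line) is not None
```

With this sketch, `x = 5` would be flagged, whereas `x := 5`, `x <- y = 3` and `a <= b` would not.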
- Check for possible violations of constants.
If enabled, Analyser will complain about any attempt to redefine or modify the value of a defined constant. Examples:
- Check type definitions.
This analysis checks, for example, whether type definitions are syntactically incorrect, duplicated, or contradictory, whether variables fail to adhere to their defined structure, or whether variables are used as records without having been declared (this chiefly concerns record variables):
Conditions and Alternatives
- Check for assignment in conditions
Though legal in languages like C, a condition test shouldn't have side-effects in structured programming. In particular, there should not be an assignment operator in conditions of loops, alternatives or CASE statements. (Executor would handle it as an error, anyway.) The legality of assignments in C conditions is a common source of bugs (an assignment operator = is accidentally used instead of the intended == comparison operator). Note that Executor does not allow a call to another subroutine diagram to be executed in a condition (which would also raise the risk of a clandestine value change).
- Check for incorrect use of the IF-statement
If only one branch of an IF statement is needed then the "TRUE" branch (i.e. the left branch) is to be used. (This is quite easy to achieve by negating the condition if necessary; you might use the magic wand button to flip an alternative.) Analyser reports IF statements where the "TRUE" branch is empty, no matter whether the "FALSE" branch contains instructions.
- Check that CASE choice value is not of a structured type (versions ≥ 3.30-16)
The detection of the applicable case in a CASE element relies on a comparison among discrete values of some primitive type; many programming languages even require them to be integral values or characters (enumerator types are perfect). Structured types (i.e. arrays or records/structs), however, hardly ever make sense here. This option will activate warnings if the choice expression doubtlessly represents a structured type.
- Check that CASE selector items are integer constants (versions ≥ 3.30-02)
Though Structorizer supports even variables and string literals as selector values for the branches of CASE elements, some target programming languages for code export (e.g. C, C++, C#, Java etc.) may not; they may only accept integral constants (including character literals and enumerator type values). This analyser option will check that all selector items are integral constants.
- Check that CASE selector lists are disjoint (versions ≥ 3.30-02)
If a selector value occurs in several branch labels of a CASE element or even if it occurs more than once in the selector list for one branch then this check will produce an entry in the report list. Though Structorizer and many programming languages simply resolve branching conflicts by going to the first branch with a matching selector, Analyser will report selector values occurring more than once in the branch labels of a CASE element with this check enabled, as it usually signals a design mistake.
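The idea of the disjointness test can be sketched in a few lines of Python. The function name and the input format (one list of selector strings per branch) are assumptions for illustration only and do not reflect Structorizer's internal data structures.

```python
from collections import Counter

def duplicate_selectors(branches: list[list[str]]) -> list[str]:
    """Return the selector values that occur more than once in a CASE element.

    branches -- one list of selector values per branch (hypothetical format);
                a repetition within a single branch label counts as well.
    """
    counts = Counter(value.strip() for branch in branches for value in branch)
    return sorted(value for value, n in counts.items() if n > 1)
```

For the branch labels `1, 2`, `3, 1` and `4, 4` the sketch would report the values `1` and `4`.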
Loops
- Check for modified loop variable
The manipulation of the counter variable of a For loop by the loop body is regarded as a no-go (though often seen in C or Java code); some programming languages (like Pascal) explicitly prohibit this interference with the loop control mechanism. Moreover, syntax errors like too few (i.e. none) or too many counter variables in the loop header are reported if this option is chosen.
- Check for consistency of FOR loop parameters
A For loop is a conveniently combined loop, which may adhere to one of two different types (counting loop / traversing loop). A dedicated editing support for the header is provided (see there), resulting in some possible redundancy between the specific entry fields and the full text. Whereas the element editor tries to synchronize the information according to the detected type, a diagram that was created under different preference settings may not meet the consistency requirements when loaded. This Analyser check detects logical differences or even conflicts among the representations. In case of a counting loop it further generates a warning if the configured step value is not a legal non-zero integer constant. It also detects if a variable name collides with a configured FOR loop parser keyword.
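The step value condition for counting loops can be illustrated with a minimal Python sketch. The function name is hypothetical and this is not Structorizer's own code; it merely assumes that a legal step is an optionally signed decimal integer literal other than zero.

```python
import re

# Assumption: a step constant is an optionally signed decimal integer literal.
INT_LITERAL = re.compile(r"[+-]?[0-9]+")

def is_legal_step(text: str) -> bool:
    """Return True if text is an integer constant different from zero."""
    text = text.strip()
    return INT_LITERAL.fullmatch(text) is not None and int(text) != 0
```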
- Check for endless loop (as far as detectable!)
Apart from intentionally inserted endless loops, an algorithm must be able to leave a loop eventually. While a FOR loop has a counting mechanism with which the loop body should not interfere, an impact of the loop body on the loop condition of WHILE and REPEAT loops is necessary. Hence, if the loop body does not change the value of any of the variables referred to by the loop condition, the Analyser will assume that the algorithm probably fails to get out of the loop.
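The heuristic described above may be sketched in Python as follows. This is a simplified illustration with hypothetical names, not the Analyser's implementation: it only recognises assignments with := or <- at the start of a line and ignores input instructions, nested elements, and subroutine calls.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Target variable of a line of the form "name := ..." or "name <- ...".
ASSIGN = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(?::=|<-)")

def may_loop_forever(condition: str, body_lines: list[str]) -> bool:
    """Return True if no variable named in the condition is assigned in the body."""
    cond_vars = set(IDENT.findall(condition))
    assigned = {m.group(1) for line in body_lines if (m := ASSIGN.match(line))}
    return cond_vars.isdisjoint(assigned)
```

A WHILE loop with condition `i < n` whose body only contains `sum <- sum + i` would be flagged, whereas a body containing `i <- i + 1` would not.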
Functions/Procedures and Calls
- Check that a subroutine header has a parameter list
Subroutines (functions / procedures) do some subordinate work within a program. Some subroutines just execute a fixed algorithm without alteration, but usually subroutine calls apply an algorithm to different sets of appropriate data (e.g. in order to compute the medium value of some array of numbers) where these data (e.g. the respective array) are to be passed to the subroutine as parameters. The list of parameter names is expected to be enclosed by parentheses and immediately to follow the subroutine name. This check ensures that such a parenthesized parameter list is present. It may be empty (if the subroutine doesn't need arguments), but at least the parentheses should be there. A main program header, however, may come without parameter list. So this check only applies to subroutine diagrams. (Structorizer may tolerate a subroutine diagram with missing parameter list and handle it as if it had an empty parameter list, but programming languages might regard a missing parameter list as syntax error.)
- Check if, in case of a function, it returns a result
A diagram of subroutine type will often represent a function, i.e. a mapping of input data to result data, the latter of which are to be returned to the calling program level. If this check is enabled then it analyses whether or not the routine will provide such a result value in any case (i.e. no matter which path through the algorithm is taken). Be aware that Structorizer supports several ways to provide a result value:
- by using a return instruction,
- by assigning the value to a variable named "result" or "RESULT",
- by assigning the value to a variable named after the routine.
Therefore it is also checked if the function happens to employ more than one of these three mechanisms, which causes ambiguity.
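The ambiguity part of this check can be illustrated by a small Python sketch. The names are hypothetical and the keyword "return" is assumed to be unchanged in the Parser Preferences; the sketch only collects the mechanisms used and does not analyse whether every path provides a result.

```python
def result_mechanisms(routine_name: str, lines: list[str]) -> set[str]:
    """Collect which of the three result mechanisms occur in the given lines."""
    found = set()
    for line in lines:
        text = line.strip()
        words = text.split()
        if words and words[0].lower() == "return":
            found.add("return")
            continue
        for op in (":=", "<-"):
            if op in text:
                # Left-hand side of the assignment.
                target = text.split(op, 1)[0].strip()
                if target in ("result", "RESULT"):
                    found.add("result variable")
                elif target == routine_name:
                    found.add("routine name")
                break
    return found
```

If the returned set contains more than one entry, the routine mixes mechanisms, which is the situation the check warns about.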
- Check for inappropriate subroutine CALLs and missing call targets.
The content of a CALL element should either be a bare external procedure call or a simple assignment instruction with a bare external function call as expression (where "bare" means that there is no further expression around). The procedure or function name must be followed by an argument list, which is — similar to the parameter list in a subroutine header — to be enclosed between parentheses (but may be empty). The arguments may be complex expressions (but are supposed not to contain external procedure or function calls themselves). The analysis here checks whether some of these restrictions are violated. Neither Executor nor code export would accept a CALL that doesn't stick to the prescribed syntax. If the syntax is verified then Analyser also checks whether the called subroutine is currently available.
Jumps and Parallel Sections
- Check for incorrect EXIT element usage
Any of the following issues are reported:
- EXIT elements, which are neither empty nor start with "leave", "return", "exit", or "throw" keyword (or what's configured for them in the Parser Preferences);
- Return instructions that are situated neither at diagram end nor in an EXIT element;
- Instructions starting with an "exit" or "leave" ("break") keyword outside of an EXIT element;
- leave/break instructions outside a loop or specifying more levels to leave than being nested in;
- return instructions in a branch of a PARALLEL section;
- exit or leave instructions with illegal parameters (only integer constants are allowed);
- Instructions directly following an EXIT element of arbitrary type (unreachable).
- Check for inconsistency risks in PARALLEL sections
If a variable being subject to modification in one of the threads of a PARALLEL section is also used in concurrent threads of the same PARALLEL section then this is reported as a potential hazard.
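As a minimal Python sketch of this rule (with a hypothetical function name and input format, not Structorizer's internal representation), each thread is described by the set of variables it modifies and the set of variables it reads. The sketch assumes that a modification in another thread also counts as "use".

```python
def parallel_hazards(threads: list[tuple[set[str], set[str]]]) -> set[str]:
    """Return variables modified in one thread and used in another.

    threads -- one (modified_vars, read_vars) pair per thread (assumed format)
    """
    hazards: set[str] = set()
    for i, (modified, _) in enumerate(threads):
        for j, (other_modified, other_read) in enumerate(threads):
            if i != j:
                # Assumption: modification in the other thread counts as use, too.
                hazards |= modified & (other_modified | other_read)
    return hazards
```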
General Syntax
- Check that brackets are balanced and correctly nested (versions ≥ 3.32-01)
Induces a warning if the number of opening (i.e. left) parentheses, brackets, or braces does not match the number of closing (i.e. right) ones in expressions and instructions. Likewise the correct correspondence of left and right brackets of the appropriate type, regarding recursive nesting, is analysed and glitches are signalled.
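A common way to test both the balance and the correct nesting is a stack, as in the following Python sketch. It is meant as an illustration only (the function name is hypothetical) and, as a simplification, does not skip brackets inside string literals.

```python
# Closing bracket -> matching opening bracket.
PAIRS = {")": "(", "]": "[", "}": "{"}

def brackets_ok(text: str) -> bool:
    """Return True if all parentheses, brackets and braces are balanced and nested."""
    stack = []
    for ch in text:
        if ch in PAIRS.values():
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack
```

Here `a[(i + 1) * 2]` passes, whereas `a[(i + 1] )` fails although the counts match, because the bracket types are not correctly nested.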
- Check variable declaration and initialisation syntax (versions ≥ 3.32-13)
Checks that variable declarations in Pascal/Basic style introduce new identifiers, that the associated type specification is correct (either a known type name or an array construction over a named type), that only a single variable identifier is introduced with an initialisation, and that a declaration in C/Java style is combined with an initialisation, e.g.:
- Check compliance with ARM-specific syntax conventions (versions ≥ 3.32-05)
This option enforces a very draconian grammar check on all elements, which is related to the ARM code generator. Representing a RISC architecture, ARM processors provide a reduced but highly regular instruction set. Structorizer might be used to design algorithms in the spirit of this philosophy on a low level. To support such an approach, a grammar was defined that reflects the ARM instruction capabilities as closely as possible (an exact equivalence is not achievable, though, cf. Special syntax for ARM). With this check option enabled, there will be a warning on element content that is more complex than the machine level could directly interpret (more exactly: that does not comply with the imposed grammar rules). No matter whether the ARM generator may actually provide e.g. a compilation of a hierarchic mathematical expression, the restricted check will only accept simple operations between two operands, of which the first one may be a variable or register name and the second one either a variable or register name or a literal. This may or may not be combined with a related ARM-generator-specific export option to reject non-elementary content.
Identifiers and Naming Conventions
- Check for valid identifiers
Induces a warning if a character sequence introduced as variable or routine name does not adhere to strict identifier syntax: only consisting of ASCII letters (of the English alphabet), digits, and underscores, not starting with a digit. See below for even more restrictive naming conventions.
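The strict identifier syntax described here corresponds to a simple regular expression, shown in the following Python sketch (the function name is hypothetical; this is an illustration, not the Analyser's code).

```python
import re

# ASCII letters, digits and underscores only; must not start with a digit.
STRICT_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_strict_identifier(name: str) -> bool:
    """Return True if name adheres to the strict identifier syntax."""
    return STRICT_IDENTIFIER.fullmatch(name) is not None
```

Names like `total_1` pass, whereas `1st` (leading digit) and `größe` (non-ASCII letters) do not.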
- Check that the program / sub name is not equal to any other identifier.
Interestingly, in languages like Pascal a value assignment to a variable named after the subroutine itself is the official way to prepare the value return (and Structorizer supports this behaviour, too). If recursion is allowed (which should be in high-level languages), it must of course be possible to refer to the same name within a nested recursive function CALL. In other contexts, however, the occurrence of the program / subroutine name in some expressions may be a sign of a potential bug. That's where this option makes sense.
- Check that identifiers don't differ only by upper/lower case.
Many programming languages (like C, Java, Oberon etc.) distinguish upper-case and lower-case letters in identifiers of variables, procedures etc. So does the Executor. Other languages (like Pascal) don't. To use names like bad, Bad, BAD, bAd, baD etc. in the same diagram is therefore not only bad style but also limits the range of export languages: names meant to be synonyms might be distinguished or, the other way round, names thought to be distinct could be regarded as identical, both of which corrupt the algorithm. When activated, this check will report every introduction of a new variable that differs only in letter case from previously assigned ones.
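The principle can be sketched in Python by grouping names by their lower-case form. The function name is hypothetical, and unlike the Analyser, which reports each newly introduced variable, this sketch simply lists the conflicting groups.

```python
from collections import defaultdict

def case_conflicts(names: list[str]) -> list[list[str]]:
    """Return groups of distinct names that differ only in letter case."""
    groups: dict[str, set[str]] = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]
```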
- Checks if an identifier might collide with reserved words.
Most programming languages use a set of reserved words — designating algorithm structures, data structures, or primitive data types. If you name a variable like one of these reserved words then the result of a code export to the respective language will cause trouble. Based on lists of important reserved words of all programming languages a code generator is plugged in for, this check will point out all instructions introducing a variable name with potential keyword collisions (and will list the languages known to use this name as reserved word).
- Discourage use of mistakable variable names «I», «l», and «O»
Letters "I" (upper-case i) and "l" (lower-case L) are very hard to distinguish in many fonts; moreover, they may resemble the digit 1 in other fonts. The same holds for letter "O" (upper-case o), which is easily mistaken for the digit 0. While it is already questionable to use single-letter identifiers at all (except within limited scope or for well-accepted concepts like coordinate names x or y), the use of the one-letter names "I", "l", and "O" as variable identifiers is a really bad idea (an absolute no-go, actually). With this check enabled, the analyser will issue a warning wherever one of these three error-prone variable identifiers is introduced.
- Check for UPPERCASE variable names. (LUX/MEN)
In Luxembourg, the Ministry of Education prescribed that at public schools variable names be written in UPPERCASE. (Elsewhere this may not be a wanted code style demand.)
- Check for UPPERCASE program / sub name. (LUX/MEN)
In Luxembourg, the Ministry of Education prescribed that at public schools program and subroutine names be written in UPPERCASE. (Elsewhere this may not be a wanted code style demand.)
- Check for standardized parameter name. (LUX/MEN)
In Luxembourg, the Ministry of Education prescribed that at public schools parameter names not only be written in UPPERCASE but also be prefixed by a lower-case 'p' letter, such that "pSOMETHING" would be a legal parameter name whereas "SOMETHING" wouldn't. (Elsewhere this may not be a wanted code style demand.)
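A possible reading of this convention as a pattern is shown in the following Python sketch. The function name is hypothetical, and the assumption that digits and underscores may follow the first upper-case letter is not taken from the rule itself.

```python
import re

# Lower-case 'p' followed by an UPPERCASE name.
# Assumption: digits and underscores are tolerated after the first letter.
LUX_PARAMETER = re.compile(r"p[A-Z][A-Z0-9_]*")

def is_lux_parameter_name(name: str) -> bool:
    """Return True if name looks like 'pSOMETHING'."""
    return LUX_PARAMETER.fullmatch(name) is not None
```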
- Check for mixed-type multiple-line instructions.
Structorizer copes with Instruction elements that contain several lines. However, if you regard an Instruction element that contains both input and output statements, or one of them together with assignments, as a violation of NSD principles, then this check will find such unwanted elements. By the way, you may easily convert a multi-line Instruction element into a sequence of single-line Instruction elements by means of the Transmutation button.
Guided Tours / Tutoring
Rather than introducing some dedicated assistants or dialogs, the guided tours just steer you smoothly via recommending hints in the Analyser report list. Ideally you start with an empty diagram (<Ctrl><N>). Then you will see some messages in the lower pane, one of them is a note informing you that a guide is switched on and how to switch it off. A little blue marker triangle will also remind you that there are tutorial hints in the report list (from version 3.30-14 on; among the tutorial hints there may be regular Analyser warnings, in which case the marker will be red):
- Short "Hello World" tour.
Guides through the creation and use of a "hello world" program.
- Guide to first program instructions (IPO model).
As in the Short "Hello World" tour, the insertion of an input instruction, an output instruction, and a processing instruction between them is recommended, but this tour already offers more freedom of choice. (By the way, IPO means "Input, Processing, Output", a fundamental and traditional division of a program into the three phases of data acquisition, data processing, and data output. Even with modern interactive applications, it makes sense to separate dialog and processing. Control processes, e.g. in automation and embedded systems, mostly also adhere to this model, often within an eternal loop: measurements / sensor monitoring, control decisions, control action.)
etc.