STRUCTORIZER User Guide

Preferences > Import

Preferences menu with item import selected

Menu item "Import ..." in the "Preferences" menu opens a dialog where options for the loading of files of different types may be configured:

Import options dialog

The dialog contains two boxes with options for the loading of

code files (e.g. ANSI-C import),
diagram files (Structorizer's own .nsd format).

What the options mean:

Character Set (for code file import)
You may control which character encoding is used on import. By default, UTF-8 is used, but if the import file turns out to be encoded in a different character set, you may select the actual character set of the file in order to import the code cleanly. While "List all?" isn't checked, the choice list offers only about six of the most common character sets. Enabling "List all?", however, expands the choice to several dozen encodings, i.e. to all that your Java version knows of.
Note: For the moment, this setting is of very limited use for files coded with an 8-bit character set, since the applied Pascal grammar doesn't cope with non-ASCII characters (apparently due to a parser bug), so they have to be eliminated before parsing actually starts anyway.
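To illustrate why the selected character set matters, here is a minimal sketch in plain Java. It is not Structorizer code: the class CharsetDemo and the method decode are hypothetical names chosen for this example. It decodes the same bytes once with the matching and once with a non-matching character set, and it asks the Java runtime how many encodings it knows, which is presumably what an expanded choice list would be based on.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Structorizer.
final class CharsetDemo {

    /** Decodes raw file bytes with the character set of the given name. */
    static String decode(byte[] raw, String charsetName) {
        return new String(raw, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // A word with two non-ASCII letters (u umlaut, sharp s), stored as ISO-8859-1 bytes
        byte[] raw = "Gr\u00fc\u00dfe".getBytes(StandardCharsets.ISO_8859_1);

        // Matching character set: the text comes back intact
        System.out.println(decode(raw, "ISO-8859-1"));
        // Non-matching character set: the non-ASCII bytes are not valid UTF-8 and get garbled
        System.out.println(decode(raw, "UTF-8"));

        // All encodings this Java runtime knows of
        System.out.println(Charset.availableCharsets().size() + " character sets available");
    }
}
```

The number printed in the last line depends on the installed Java version.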

Log to folder (for code file import, versions ≥ 3.27)
During parsing and diagram generation a log file will be written if this checkbox is ticked. Usually it is placed as <source_file>.log next to the imported source file. Here you may specify that all log files be directed to the specified folder instead (this is particularly helpful if the folder containing the source files is write-protected). With "." (single dot) you will direct the log files to the "current directory" (i.e., the last directory cached by Structorizer file-related actions). Button "<<" opens a directory selection dialog.
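The placement rule for the log file can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the class and method names are hypothetical, the log name is assumed to be the complete source file name with ".log" appended, and the "current directory" is simply passed in as a parameter.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of the log file placement, not Structorizer code.
final class LogPathSketch {

    /**
     * @param sourceFile the imported source file
     * @param logFolder  the configured folder; null or empty means "not configured"
     * @param currentDir the directory that "." is supposed to stand for
     */
    static Path logFileFor(Path sourceFile, String logFolder, Path currentDir) {
        // Assumption: <source_file>.log keeps the original file extension
        String logName = sourceFile.getFileName() + ".log";
        if (logFolder == null || logFolder.isEmpty()) {
            // No folder configured: the log goes next to the source file
            return sourceFile.resolveSibling(logName);
        }
        if (logFolder.equals(".")) {
            // Single dot: the log goes to the current directory
            return currentDir.resolve(logName);
        }
        // Otherwise: the log goes to the configured folder
        return Paths.get(logFolder).resolve(logName);
    }

    public static void main(String[] args) {
        Path source = Paths.get("projects", "demo", "prog.c");
        Path current = Paths.get("work");
        System.out.println(logFileFor(source, null, current));
        System.out.println(logFileFor(source, ".", current));
        System.out.println(logFileFor(source, "logs", current));
    }
}
```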

Import variable (and method) declarations (for code file import, versions ≥ 3.27)
Since version 3.26-02, pure variable declarations (i.e. without immediate initialisation) in Pascal or VB syntax are tolerated as content of Instruction elements. Consequently, code import may optionally convert variable declarations found in the source code into declaring Instruction elements in order to preserve the original type information. To enable this, activate the checkbox. Since version 3.31-02, this option has an additional impact on Java and Processing import: the method declaration elements described in Java code import will only be placed into the created Includable diagram that represents an imported class if this option is chosen.

Import source code comments (for code file import, versions ≥ 3.27)
Structorizer is capable of importing comments from the source code and associating them with the respective diagram elements. This checkbox enables the comment import; while it remains unchecked, source code comments will be ignored.

Place configured optional keywords around conditions (for code file import, versions ≥ 3.30-07)
If this option is enabled, then on importing IF statements, CASE statements, WHILE or REPEAT loops, the conditions they contain will be decorated with the pre and post marker words currently configured in the Parser Preferences for the respective type of element. If, e.g., an alternative is found in the code with a condition nTimes < 4 while the Parser markers for IF statements are specified as Pre = "if" and Post = "?", then the resulting IF element will contain the text "if nTimes < 4 ?" rather than just "nTimes < 4".
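The decoration itself amounts to a simple concatenation. The following sketch reproduces the example from the text. The class ConditionDecorator and its method are hypothetical names, and the handling of empty markers (they are skipped) is an assumption.

```java
// Hypothetical sketch of the keyword decoration, not Structorizer code.
final class ConditionDecorator {

    /** Surrounds a condition with the configured pre and post markers, skipping empty ones. */
    static String decorate(String pre, String condition, String post) {
        StringBuilder sb = new StringBuilder();
        if (pre != null && !pre.trim().isEmpty()) {
            sb.append(pre.trim()).append(' ');
        }
        sb.append(condition.trim());
        if (post != null && !post.trim().isEmpty()) {
            sb.append(' ').append(post.trim());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Markers as in the example: Pre = "if", Post = "?"
        System.out.println(decorate("if", "nTimes < 4", "?")); // prints "if nTimes < 4 ?"
        // Option disabled or no markers configured: the bare condition remains
        System.out.println(decorate("", "nTimes < 4", ""));    // prints "nTimes < 4"
    }
}
```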

Save parse tree as text file after import (for code file import, versions ≥ 3.27)
The plugins doing the code import in Structorizer rely on an open-source LALR(1) parser in combination with a language-specific grammar. A successful parsing process results in a syntax tree of the input program from which the diagram is built. This parse tree may be delivered as text file <source_file>.parsetree.txt in the source directory. Just enable this option if you are interested in having a look at it or to clarify wrong or questionable import results.

Maximum line length (for word wrapping) (for code file import, versions ≥ 3.28-11)
Specifies a number of characters per line, beyond which a word wrapping will split element text lines on import, i.e. where some atomic token (a word) of the text line would exceed the specified line length, the line will be split with a backslash at the end and the subsequent tokens will go to the next line etc. A value 0 means that no automatic word wrapping will be done.
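The wrapping rule can be sketched roughly like this. The sketch is deliberately simplified and makes several assumptions that may differ from the actual import: tokens are taken to be whitespace-separated words, the continuation backslash is not counted towards the line length, and a single token longer than the limit is never broken. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified sketch of the word wrapping, not Structorizer code.
final class WordWrapSketch {

    /** Splits a text line between tokens so that no part exceeds maxLen; 0 disables wrapping. */
    static List<String> wrap(String line, int maxLen) {
        List<String> result = new ArrayList<>();
        if (maxLen <= 0 || line.length() <= maxLen) {
            result.add(line);   // nothing to do
            return result;
        }
        StringBuilder current = new StringBuilder();
        for (String token : line.trim().split("\\s+")) {
            // Would this token push the current part beyond the limit?
            if (current.length() > 0 && current.length() + 1 + token.length() > maxLen) {
                result.add(current + " \\");   // mark the continuation with a backslash
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(token);
        }
        result.add(current.toString());
        return result;
    }

    public static void main(String[] args) {
        // With a limit of 20 this yields two parts:
        //   total := total + \
        //   price * quantity
        for (String part : wrap("total := total + price * quantity", 20)) {
            System.out.println(part);
        }
    }
}
```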

Maximum number of imported diagrams for direct display (for code file import, versions ≥ 3.28-05)
Some code files may contain a large number of functions and procedures. Pushing hundreds of diagrams to Arranger used to degrade the GUI response time massively.

Though the underlying performance problems were eventually solved with version 3.29-09, it may often be a better idea to save all the imported diagrams to files first and then to open and inspect them individually. This option allows you to specify the threshold number beyond which the diagrams are no longer placed in Arranger. Instead, you will be offered to save the diagrams to the file system (or otherwise to discard them). The default threshold is 50. The maximum threshold you can specify for automatically accepted diagrams is 250.
Remark: This threshold does not apply to the loading of .arr or .arrz files or to dragging .nsd files into Arranger, nor does it prevent you from pushing diagrams from the Structorizer main form.

Language-specific options (for code file import)
Besides the general import options described above, some settings are specific to a single import language. These are specified by the plugins and not hard-coded in the Import Options dialog. In order to configure those language-specific options
1. select the import language of interest via the choice box next to the button and
2. push the button "Language-specific Options"
to see what options are available for the respective import language and configure the relevant settings. Here are the settings currently specified:
- ANSI-C99
  - Externally defined type names
    C files may depend on type names introduced by included headers, which are not available for the C parser of Structorizer. The C grammar is very sensitive to type names, however. So you should list all type names causing parser errors in this text-field (separated by commas), thus allowing the Structorizer-internal preprocessor to make them digestible for the parser.
  - Redundant pre-processor symbols or macros (versions ≥ 3.28-03)
    Often, there are certain pre-processor defines in C with a merely declaratory effect, e.g. "WINAPI" in order to tell that a function definition belongs to the WinAPI, or "_opt_in" as a pointer argument prefix indicating that this argument may expect a value (rather than being used to export values) or may be set to NULL ("optional"). Pre-processor symbols like these are not only completely redundant for Structorizer import but would also cause the parser to fail. In this option you may enumerate names of this kind to be eliminated on import. You can also name parameterized macros that should be erased by appending parentheses to the name; optionally, you may put the number of arguments in the parentheses (see image).
  - Use type names and defines from WINAPI / MinGw (versions ≥ 3.28-05)
    In order to import C code that uses e.g. WINAPI defines or those from MinGw, it would usually be hard to configure every single symbol, macro, or typedef used in the code via the text fields above (in a trial-and-error manner). Instead, you may select the respective checkbox to provide the parser with the minimum required classification info about all defines of the respective library at once. They will neither override the text configuration fields nor be appended to them but are made available behind the scenes. Be aware that this does not provide the full definitions of the words but just helps the parser to let the source file pass.
- COBOL
  - Import Debug lines as valid code
    Lines starting with ">>D" (free format, see below) or having an indicator symbol 'D' (in fixed format, see below) in COBOL are so-called debug lines specifying some code for debugging purposes. If the checkbox "Import Debug lines as valid code" is selected then these debug lines will be imported as active code, otherwise they will be converted to comments on import.
  - Decimal comma (instead of decimal point)
    It is possible that floating-point literals in a COBOL file use decimal commas (e.g. German locale) rather than the usual decimal points (e.g. British or US American locale). In order to make such a file pass the syntax analysis, you must select this checkbox. (A file-internal directive in this regard is not recognised by Structorizer.)
  - Fixed-form format
    COBOL files may be formatted in the traditional fixed form (with fixed column zones) or in free format (like most programming languages).
    
    Example for fixed-form COBOL Example for free-form COBOL
    
    The parser must know in advance what format the file is using unless a directive (" >> source format is free/fixed") in the first line of code specifies it (see example in the table above).
  - Indicator column in fixed format
    Even if fixed format is used, there may be differences in the column where the classifying indicator characters are placed. Here you can specify at what column they appear in the specific file to be loaded (default is column 7). Just enter the column number.
  - Column of ignored text in fixed format
    Likewise, there is a certain column in fixed format from which on the remaining text of the line is to be ignored by the parser. Default is 73. Enter the actual column number before importing a fixed-format COBOL file if it differs from 73.
  - Tidy up routine call chains (after PERFORM THRU)
    This mode (versions ≥ 3.32-09) addresses the import of PERFORM THRU statements. They refer to a code span between two given labels and thus generate a chain of calls to (parameterless) routines derived from so-called paragraphs or sections. Without tidying, all these calls (≥ 2) will be put in a single CALL element (which violates the rules for forming CALL elements but best resembles the original context). The multi-line CALLs can easily be split by transmutation (magic wand or <Ctrl><T>), but this tends to become a real hassle if the diagrams contain dozens of such CALLs. In addition, the last call of such a chain usually refers to a redundant routine, which contains only a disabled paragraph exit. The tidying mode now allows you to control this behaviour and facilitates the generation of clear and concise diagrams:
    - tidy calls and routines (the new default) will automatically split the multi-line calls, remove calls referring to redundant routines and eliminate these routines as well;
    - tidy calls only works as before but omits the elimination of the redundant routines (they will remain unreferenced in the arrangement group of imported diagrams); some of the referencing Calls may also just be disabled rather than removed;
    - don't tidy does not do any tidying, which is less suited for easy import but may be helpful for "forensic" analysis of import problems concerning internal procedures. In this case special comments in the multi-line CALL elements will recommend the manual steps to tidy up the diagram (see COBOL import hints).
- Java SE 8 (versions ≥ 3.31)
  - Convert declarations etc. to Pascal/Structorizer style
    This option configures the degree of the syntactical conversion. With this option selected, Structorizer would convert e.g. a variable declaration like int[][] matrix; to
    var matrix: array of array of int
    otherwise it would leave it more or less as is (keeping it more recognizable for Java programmers but less usable in Structorizer). If you intend to make an attempt to execute (debug) or re-export the resulting diagrams then it is strongly recommended to select this option.
    Note that the general import option "Import variable (and method) declarations" also plays an important role for the contents of the resulting diagrams.
  - Dissect anonymous inner classes into diagrams (versions ≥ 3.32.17)
    This option ensures that anonymous inner classes defined "on the fly" on instantiation, as in the following example, will be converted into local (sub)class diagrams. Without it, the defining code will be passed as is, i.e. as a usually very long Java expression without translation, into the instantiating instruction. The option is enabled by default. Example (where a new anonymous inner class is derived from class WindowAdapter):
         this.addWindowListener(new WindowAdapter() {
             @Override
             public void windowClosing(WindowEvent evt)
             {
                 btnCancel.doClick();
             }
         });
    The class in the derived diagrams will, of course, not be anonymous but obtain a generic name like e.g. WindowAdapter_5d880813. The instantiating instruction in the respective element would then be:
  - Separate >> of nested type parameters to > > (versions ≥ 3.32.18)
    The parser used to mistake a sequence of closing angle brackets, as they occur with nested type parameters (like in HashMap<String, List<Integer>>), for a shift operator, which made the import fail. The option enables a heuristic preprocessing of the source file to tell right-shift operators from clusters of closing angle brackets and to insert blanks between the latter in order to let the code pass the grammar. This works pretty well but is not guaranteed to preserve all actual operators. If an operator should get broken this way, you won't have a chance to import the file unless you switch off this option (which is active by default) and insert the blanks between the clinging angle brackets manually.
- Processing (versions ≥ 3.31)
  - Convert declarations etc. to Pascal/Structorizer style
    See Java SE 8.
  - Dissect anonymous inner classes to diagrams (versions ≥ 3.32.17)
    See Java SE 8.
  - Separate >> of nested type parameters to > > (versions ≥ 3.32.18)
    See Java SE 8.
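How the COBOL options "Indicator column in fixed format" and "Column of ignored text in fixed format" act on a source line can be sketched as follows. This is an illustration under assumptions, not the code of the COBOL import plugin: the class FixedFormLine and its members are hypothetical names, and the demo passes the columns 7 and 73 of the conventional COBOL reference format explicitly as arguments.

```java
// Hypothetical sketch of fixed-form line dissection, not Structorizer code.
final class FixedFormLine {
    final char indicator;   // the classifying character, ' ' if the line is too short
    final String code;      // the text between the indicator and the ignored zone

    private FixedFormLine(char indicator, String code) {
        this.indicator = indicator;
        this.code = code;
    }

    /** Column numbers are 1-based, the way they are entered in the options dialog. */
    static FixedFormLine parse(String line, int indicatorColumn, int ignoredFromColumn) {
        char ind = line.length() >= indicatorColumn ? line.charAt(indicatorColumn - 1) : ' ';
        int start = Math.min(indicatorColumn, line.length());      // first index after the indicator
        int end = Math.min(ignoredFromColumn - 1, line.length());  // first ignored index
        String code = start < end ? line.substring(start, end) : "";
        return new FixedFormLine(ind, code);
    }

    /** A 'D' in the indicator column marks a debug line. */
    boolean isDebugLine() {
        return Character.toUpperCase(indicator) == 'D';
    }

    public static void main(String[] args) {
        FixedFormLine line = parse("000100D    DISPLAY 'trace'.", 7, 73);
        System.out.println(line.isDebugLine());   // prints "true"
        System.out.println(line.code.trim());     // prints "DISPLAY 'trace'."
    }
}
```

Whether a debug line recognised this way ends up as active code or as a comment would then depend on the option "Import Debug lines as valid code".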

Replace keywords on loading a diagram (NSD files)
As outlined in other sections of this manual, the interpretability of a diagram strongly depends on the parser preferences in particular, i.e. an input instruction will only be recognized if it starts with the currently configured input keyword etc. Hence, if you load a diagram created by someone else (or years before), it is not unlikely that it adhered to a different set of parser preferences when it was saved. So the first thing you used to obtain in earlier versions was a lot of Analyser warnings. In order to sort them out you would have to adapt either the parser preferences to the diagram or vice versa. Both ways are cumbersome and can even get nasty if you want to combine several diagrams of different origin via Calls.
To overcome this unpleasant situation, version 3.25-01 began to save diagram files with all non-empty parser preferences as attached attributes. This way, recent diagram files became prepared for this refactoring option:
With the checkbox enabled, Structorizer will replace all obsolete keywords throughout the diagram being loaded with the respective currently configured parser keywords, such that the diagram will automatically fit into your current context. (Actually, the diagram content will be modified to maintain its semantic equivalence.) The refactoring won't work with legacy NSD files originating from versions before 3.25-01, of course, because older .nsd files lack the decisive information. You would have to adapt them manually first and re-save them with a Structorizer of at least version 3.25-01 to benefit from the refactoring mechanism. (For this manual refactoring, however, you can benefit from the Find & Replace tool.)
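The idea of the keyword replacement can be sketched as follows. This is a strongly simplified illustration, not the actual refactoring code: it only handles keywords at the start of a text line and compares them case-sensitively, both of which are assumptions. The class name, the method name and the sample keywords are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified sketch of the keyword refactoring, not Structorizer code.
final class KeywordRefactorSketch {

    /**
     * Replaces a leading keyword stored with the diagram file
     * by the keyword currently configured for the same purpose.
     */
    static String refactorLine(String line, Map<String, String> storedToCurrent) {
        for (Map.Entry<String, String> entry : storedToCurrent.entrySet()) {
            String stored = entry.getKey();
            // Only whole leading words count, so "INPUTS" is not touched by "INPUT"
            if (line.equals(stored) || line.startsWith(stored + " ")) {
                return entry.getValue() + line.substring(stored.length());
            }
        }
        return line;   // no stored keyword found: leave the line as is
    }

    public static void main(String[] args) {
        // Keywords saved with the diagram mapped to the ones configured now (sample values)
        Map<String, String> keywords = new LinkedHashMap<>();
        keywords.put("INPUT", "read");
        keywords.put("OUTPUT", "write");

        System.out.println(refactorLine("INPUT n", keywords));       // prints "read n"
        System.out.println(refactorLine("OUTPUT n * 2", keywords));  // prints "write n * 2"
        System.out.println(refactorLine("n <- n + 1", keywords));    // prints "n <- n + 1"
    }
}
```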
It is highly recommended to enable this import option. You may leave it unchecked only if you are sure never to be confronted with diagrams originating from a different preferences context, or if you need to see the originally used keywords. You will even be warned when a diagram with a differing keyword set is loaded while the refactoring mode is switched off, e.g.:

You may now choose among the three options offered. If you opt for c), i.e. to postpone the decision, then the first attempt to modify the diagram later on will pop up a slightly different dialog (since version 3.27):

Further import options may be added in future versions on user demand.