S T R U C T O R I Z E R - User Guide

File I/O Fundamentals

A file is a resource administered by the operating system (OS), situated in the file system and identified there by a file path. A program that works with a file must go through the following mandatory phases:

  1. Opening of the file for the intended access type (read / write);
  2. Access to the content according to the requested access type;
  3. Closing of the file as soon as access is completed.

An attempt to open a file is by no means certain to succeed, so a failure must always be taken into account. It is therefore worth remembering the following abstract algorithmic schema for dealing with files inside a program:

Fundamental algorithm template for working with files

The yellow elements are the auxiliary instructions and tests dealing with the resource acquisition and release, the green element symbolizes the actual file processing, the red element represents the error path.
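
As a rough textual counterpart of the diagram, the same schema can be sketched in Python. This is only an illustration under assumptions: it uses Python's built-in open and exception handling rather than the Structorizer file routines, and the function name process_file and the sample file names are hypothetical.

```python
import os
import tempfile


def process_file(path):
    """Open, process, and close a text file; return its line count or None on failure."""
    # Phase 1: try to acquire the resource; failure is a normal outcome.
    try:
        handle = open(path, "r", encoding="utf-8")
    except OSError as err:
        # Error path: the file could not be opened.
        print(f"Cannot open {path}: {err}")
        return None
    # Phase 2: the actual file processing (here: counting lines).
    try:
        return sum(1 for _ in handle)
    finally:
        # Phase 3: release the resource as soon as access is completed.
        handle.close()


# Hypothetical usage with a temporary file and a missing file.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample = os.path.join(tmp_dir, "sample.txt")
    with open(sample, "w", encoding="utf-8") as out:
        out.write("first\nsecond\n")
    existing_result = process_file(sample)
    missing_result = process_file(os.path.join(tmp_dir, "missing.txt"))
```

The three phases map onto the comment markers in the function; the except branch corresponds to the error path of the diagram.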

See the Syntax page for a table of the file routines that were made available for Executor with release 3.26.

Opening a file

Consequently, a program or routine must first request a file from the OS for a certain purpose (reading or writing) before it can work with it. Structorizer offers three different opening functions - one for reading access (fileOpen) and two for writing access (fileCreate and fileAppend):

  • fileOpen requests a file identified by a path string (e.g. "documents/nice_to_have.txt") for reading data from it (i.e. as input file) and requires the file to exist and to be readable with the permissions of the user.
  • fileCreate requests a file with the given path for writing data into it (i.e. as output file). In contrast to fileOpen, the file is not required to exist beforehand, but the directory must grant the user writing permissions. If a file with this path already exists, it will be emptied without warning.
  • fileAppend requests a file (which may or may not exist) for appending text to its end (i.e. as output file), which requires writing permissions, of course.

Each of the three opening routines returns an integer value, which in case of success serves as program-internal identifier and file handle for all the access operations you may perform with the opened file. Numbers greater than zero are valid file handles, whereas numbers less than or equal to zero signal that the opening attempt failed:

  • 0, -1: unspecific IO error
  • -2: file not found (in case of fileOpen)
  • -3: insufficient permissions

You should always test whether the opening function returned a valid handle! If you obtained a positive number (i.e. a valid file handle) then you may apply the appropriate file-related functions or procedures, always providing the file handle as first argument. Calling any of these subroutines is illegal if the file handle is 0 or negative, if it was not obtained from an opening function, if the kind of access doesn't match, or if the associated file has been closed in the meantime.

Closing a file

As soon as you no longer need to read from or write to a file, you should make sure to close it using the procedure fileClose (with the file handle as argument). This releases the file as resource and - in case of a file opened for writing - flushes the associated buffer, ensures the file consistency in the file system, and thus makes the file available for other processes and applications. Only after you have closed the file can you be sure that the data are persistently stored in the file system.

Once you have closed a file, the handle value becomes stale, i.e. it cannot be re-used for file operations. Even if you reopen a file that you have used before, you will obtain a new, different handle. The procedure fileClose must not be applied a second time to a file already closed.
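
A comparable behaviour can be observed with ordinary Python file objects, which serve here only as an analogy and not as the Structorizer routines: closing flushes the buffered data, and a closed object rejects further access. The file name log.txt is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "log.txt")
    out = open(path, "w", encoding="utf-8")
    out.write("persisted only after close or flush\n")
    out.close()                      # flushes the buffer and releases the file

    try:
        out.write("too late")        # the closed object is stale
        stale_write_rejected = False
    except ValueError:
        stale_write_rejected = True

    # Reopening yields a new object; the old one stays unusable.
    with open(path, "r", encoding="utf-8") as check:
        stored = check.read()
```

One difference worth noting: Python tolerates a repeated close() call on the same object, whereas fileClose, as described above, must not be applied twice.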

Reading from a file

You can only read from a file if it has been opened by means of fileOpen before, and you must pass the file handle obtained from fileOpen as routine argument. These are the available reading functions:

  • fileReadChar will return the next character of the file (including a blank or newline character).
  • fileReadInt will return an integer value if and only if the next token in the file is an integer literal, otherwise it will raise an error.
  • fileReadDouble will return a floating-point number if and only if the next token in the file is a number literal (be it integral or not), otherwise it will raise an error.
  • fileRead may return:
    • an integral number if an integer literal (e.g. -17) follows at the current reading position;
    • a floating point number if a floating-point literal (e.g. 3.6e17) follows at the current reading position;
    • an array of simple-type elements if a comma-separated sequence of primitive-type literals, enclosed in curly braces, (e.g. {0, 25, foo, "text without commas", 6.9}) follows at the current reading position;
    • a string consisting of the content of a quoted character sequence (e.g. "This text, however, might contain commas, but no escaped quote") following at the current reading position;
    • a string comprising the next character sequence not interrupted by a blank (e.g. foo) in any other case.
  • fileReadLine will always return a string comprising the (remainder of the) current line up to but not including the next newline character. The newline character will be consumed, though.

Since reading beyond the end of the file raises an error, it is generally a good idea to check the "end of file" property of the file being read. This is done by function fileEOF, which returns true if the file end is reached and false otherwise. See its use in the examples at the end of this section.
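
The interplay of an end-of-file test with line-wise reading can be sketched in Python as follows. Python has no direct fileEOF counterpart, so the class LineReader below is a hypothetical helper that looks one line ahead; an in-memory io.StringIO stands in for an opened file.

```python
import io


class LineReader:
    """Hypothetical reader that offers an explicit end-of-file test."""

    def __init__(self, stream):
        self._stream = stream
        self._pending = stream.readline()   # look one line ahead

    def eof(self):
        # An empty string from readline() means the end of the file.
        return self._pending == ""

    def read_line(self):
        if self.eof():
            raise EOFError("attempt to read beyond the end of the file")
        line = self._pending.rstrip("\n")   # newline is consumed, not returned
        self._pending = self._stream.readline()
        return line


reader = LineReader(io.StringIO("alpha\nbeta\n\ngamma"))
lines = []
while not reader.eof():      # test before every read
    lines.append(reader.read_line())
```

Testing eof() before each read keeps the loop from ever reaching the error raised in read_line.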

Writing to a file

You may only write to a file if you obtained a valid (i.e. positive) file handle from one of these two opening functions: fileCreate or fileAppend.

  • fileWrite will write an arbitrary value just as is - without any additional separators, line feeds etc. This allows you to compose arbitrary texts without Structorizer interference.
  • fileWriteLine will do the same but add a newline character.
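
The difference between the two writing routines can be illustrated with a small Python sketch. The helper names file_write and file_write_line are hypothetical stand-ins, and an in-memory io.StringIO replaces a real output file.

```python
import io

out = io.StringIO()          # stands in for a file opened for writing


def file_write(stream, value):
    stream.write(str(value))             # value as is, no separator


def file_write_line(stream, value):
    stream.write(str(value) + "\n")      # same, plus a newline


file_write(out, 12)
file_write(out, " ")         # separators must be written explicitly
file_write(out, 3.5)
file_write_line(out, "")     # ends the current line
file_write_line(out, "done")
text = out.getvalue()
```

Because nothing is inserted automatically, the space between 12 and 3.5 exists only because it was written as a value of its own.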

Examples

The following examples may illustrate how to work correctly with files:

1. FileWriteDemo just writes text / values obtained via an input instruction to a file, separating all values by a space character (which is the "natural" separator for reading tokens from a file) until the user enters a single '$' sign. This will result in one long line of text in the file. The failure path is empty, but you might insert an output instruction with an error message (see example 6 below).

Demo how to write values to a file

2. FileReadDemo goes the opposite way: It opens a text file and reads tokens (substrings separated by white space) from it - one per loop cycle - tries to interpret them as numbers or, by default, as strings, and writes the interpreted values to the output one by one. Be aware that the values read may not be equivalent in number and type to the expressions you wrote into that very file. For example, a written non-quoted string with spaces will be split into its "words" on reading the file, and each of these "words" (tokens) will independently be checked for literal syntax. A written string These are 4 words will thus be read as four values, three of which (the 1st, 2nd, and 4th) are strings and one of which (the 3rd) is an integer value.

If the file contains a quoted string, i.e. several words the first of which starts with " and the last of which (within the same line!) ends with ", then this sequence will be read as one string, with the quotes dropped.

If the file contains comma-separated tokens between curly braces, e.g.

{23, 7, -9.8e4, "something"}

then at least an attempt will be made to convert this into an array.

In this respect, you may find the new built-in test functions isArray, isNumber, isString, isChar, and isBool helpful for making use of the values read from a file.

Demo how to read data from file
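
The token-wise interpretation described for FileReadDemo can be approximated by the following Python sketch. It is a simplification under assumptions: the hypothetical function interpret_token only distinguishes integers, floating-point numbers, and plain strings, and it does not handle quoted strings or arrays in curly braces. Python's int and float also accept some spellings (such as inf) that need not count as number literals in Structorizer.

```python
def interpret_token(token):
    """Return an int, a float, or the unchanged string for one token."""
    for convert in (int, float):
        try:
            return convert(token)
        except ValueError:
            pass                 # not a literal of this type, try the next one
    return token                 # default: keep the token as a string


# The written text is split at white space, then each token is checked on its own.
values = [interpret_token(tok) for tok in "These are 4 words".split()]
kinds = [type(v).__name__ for v in values]
```

The result contains four separate values, three strings and one integer, which mirrors the loss of the original string boundaries described above.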

3. FileReadDoubleDemo: This algorithm relies on the assumption that the text file consists of several white-space-separated character sequences interpretable as floating-point numbers, e.g.

4.5 -98.0e5 7
12   10293

It reads the values from the file as double values into an array and then writes the array elements to the output. A token not interpretable as number will abort the algorithm.

Demo for reading double values from file into an array
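
For comparison, a minimal Python sketch of the same idea follows. It is not the diagram's code: an in-memory io.StringIO holding the sample content from above stands in for the text file, and Python's float conversion plays the role of fileReadDouble.

```python
import io

# Stands in for the text file with the sample content shown above.
source = io.StringIO("4.5 -98.0e5 7\n12   10293\n")

numbers = []
for token in source.read().split():
    # float() raises ValueError for a token that is not a number,
    # which ends the loop much like the diagram aborts.
    numbers.append(float(token))

for index, value in enumerate(numbers):
    print(index, value)
```

Integral tokens such as 7 or 12 end up as floating-point values as well, in line with the description of fileReadDouble.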

4. FileReadCharDemo: This algorithm reads the input file given by the requested path character by character (including all whitespace characters otherwise ignored!) and writes both the graphical representation and the decimal code of each character to the output stream.

Demo how to read single characters from file

5. FileAppendDemo: This algorithm tries to open a text file for writing without clearing its previous content and appends the interactive user input as additional lines to its end. The user may exit the loop by leaving the text input field empty.

Demo how to use the fileAppend function

6. FileCopy: The last diagram example demonstrates how to copy a text file line by line. The cyan elements are related to the source file, the green elements refer to the target file. The red and pink elements are the error paths for the opening attempts.

Demo to copy a text file line by line
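
The structure of FileCopy, with two opening attempts and one error path each, can be sketched in Python as follows. This is an analogy using Python's built-in file objects, not the Structorizer routines; the function name copy_text_file and all file names are hypothetical.

```python
import os
import tempfile


def copy_text_file(source_path, target_path):
    """Copy a text file line by line; return True on success, False otherwise."""
    try:
        source = open(source_path, "r", encoding="utf-8")
    except OSError as err:
        print(f"Cannot open source file: {err}")       # first error path
        return False
    try:
        target = open(target_path, "w", encoding="utf-8")
    except OSError as err:
        print(f"Cannot create target file: {err}")     # second error path
        source.close()                                 # release what was acquired
        return False
    try:
        for line in source:
            target.write(line)
    finally:
        target.close()
        source.close()
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, "source.txt")
    dst = os.path.join(tmp_dir, "copy.txt")
    with open(src, "w", encoding="utf-8") as out:
        out.write("line 1\nline 2\n")
    copied = copy_text_file(src, dst)
    with open(dst, "r", encoding="utf-8") as check:
        copy_content = check.read()
    missing = copy_text_file(os.path.join(tmp_dir, "absent.txt"), dst)
```

Note that the second error path still has to close the source file, because that resource was already acquired successfully.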

Code export

Efforts were made to enable the code generators of Structorizer to produce a more or less sensible equivalent of algorithms using the Structorizer File API routines in the target code. With some languages a relatively simple transformation could be found, for others (e.g. C++, C#, Python) a static object class "StructorizerFileAPI" emulating the Structorizer File API may have to be inserted into or attributed to the resulting code where an in-place substitution was not feasible. The Java export just adds some private static methods to the resulting class itself.

In some target languages, very different types of file handles or an exception-based function concept make it extremely difficult or even impossible to construct a reasonably compatible test for success, or would require redesigning the entire context of the algorithm.

The code export to the shell script languages bash and ksh had to capitulate to the completely different paradigm of handling files, where you would have to redirect standard input or output to text files. File output can be done by individual echo commands appending to the target file (echo $value >> $filepath), but input can only be done within one single loop because there is no explicit file descriptor that keeps a reading position between separate access attempts.

Therefore, full (100 %) semantic equivalence cannot be expected, not even with high-level languages, though their file concepts are roughly comparable.

Structorizer still doesn't offer export solutions for Oberon and BASIC. Approaches may be added with later versions, though.