Chapter 9: Beyond the program

Story metadata

When an interactive story is released into the wild, the community takes over, cataloguing it, preserving it for posterity, and making it available via archives and databases. To help streamline this work, authors are strongly encouraged to provide metadata, such as the title of the work and the name of the author, in a machine-readable format inside the story file.

Dialog strives to comply with the Treaty of Babel, which is an initiative to establish a common standard for interactive-fiction metadata. The information is supplied via a special set of predicates, that are queried at compile-time.

Each published story should at the very least include a unique identifier—called the IFID—to distinguish it from other works. Different versions of the same story (e.g. bug fix releases) should be marked with the same IFID. To declare the IFID, use the following predicate:

(story ifid)XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX
[Copy to clipboard]

where each X is an uppercase hexadecimal digit. There's an online tool on the Dialog website that lets you generate such declarations with fresh IFID numbers. You may also use any program capable of generating a universally unique identifier (UUID), such as the uuidgen command under Linux (available in the Debian package uuid-runtime).

The Dialog compiler will complain if the story contains more than one hundred lines of source code and no IFID is provided. The standard library is also included in the line count, so it's really only trivial test programs like the examples provided so far in this manual that are exempt from the warning.

A story author may also want to specify one or more of the following:

(story title)

Put the title of the story in the body of this rule definition.

(story author)

The full name of the author, or a pseudonym, should go here.

(story blurb)

Put a brief, selling description of the story here.

(story noun)

The default story noun is “An interactive fiction”.

(story release $)

Specify the release number in the parameter; the rule body should be empty.

The story blurb is allowed to contain paragraph breaks, (par), but otherwise these rule bodies should consist entirely of plain, static text. The predicates are queried at compile-time, and the text is extracted and stored in the relevant locations in the output files. You may also query these predicates from your own code; the standard library invokes several of them while printing the banner, for instance.

When compiling to the zblorb file format, it is possible to include cover art with your story. This is not part of the Dialog language itself, but is an extra service provided by the Dialog compiler. Specify the name of a PNG file and, optionally, a short textual description of the image, using the commandline options -c and -a respectively. The cover image should be no larger than 1200 x 1200 pixels in size, and preferably square.

The raw z8 format has limited support for metadata, so if you select this output format, only the IFID and release number will be included in standard locations in the file. The other predicates are of course still accessible from within the program.

Two elements of metadata are supplied automatically by the compiler: The compiler version string and the serial number. These can be printed from within the program (and you are strongly encouraged to do so as part of the story banner), by querying the following built-in predicates:

(serial number)

(compiler version)

The serial number is the compilation date in YYMMDD format.

Fatal runtime errors

Attempts to violate the constraints imposed on dynamic predicates will result in fatal runtime errors.

Such errors can also occur if one of the heaps (main, auxiliary, and long-term) is ever exhausted. The heaps are three relatively large memory arrays that the Dialog runtime system uses to keep track of local variables, lists, activation records, choice points and other transient data structures. When one of these arrays fills up—which could happen at any time, in the middle of any operation—there's not much Dialog can do, except abandon all hope and re-initialize itself.

For the Z-machine backend, the main heap, auxiliary heap, and long-term heap occupy 1000, 500, and 500 words respectively by default, and this should be more than enough for most interactive stories. The size of each area can be adjusted by passing commandline options to the compiler. A built-in predicate, (display memory statistics), prints the peak usage of each memory area so far. During debugging and testing, you may wish to invoke this predicate just before quitting, as it will tell you how close you are to the limits.

The main rationale for throwing runtime errors when an invalid (now) operation is attempted, instead of merely failing, is that the compiler can do a better job of optimizing the code if it can assume that now-expressions always succeed.

A fatal runtime error will reset the Dialog evaluation state, clear the heaps, and restart the program from the (error $ entry point) predicate. The parameter is a small integer, representing the nature of the error:

1: Heap space exhausted.

2: Auxiliary heap space exhausted.

3: Type error: Expected object.

4: Type error: Expected bound value.

5: Invalid dynamic operation.

6: Long-term heap space exhausted.

After a fatal error, the game world could be in an inconsistent state, and there's not much one can do except print an error message and quit. Or is there? The standard library attempts to bring the game back to a known state via the undo facility of the Z-machine.

Some notes on performance

Both the Dialog language and its compiler have been designed with runtime performance in mind. In particular, the language lends itself well to static analysis, and the compiler performs a thorough global analysis of the program in order to choose the ideal way to represent each predicate. Often, what looks like a lengthy search through a series of rule heads can compile down to a simple property lookup or a bunch of comparison instructions.

Execution speed

The programmer can lay the foundation for a well-optimized program, by following two important design principles.

The first principle is: Always dispatch on the first parameter. Whenever a bound value is used to select among many different rule definitions, try to phrase the name of the predicate so that the bound value is the first parameter. That's because the Dialog compiler considers the parameters in left-to-right order when generating accelerated code for rule lookups. An example of a carefully named predicate is (from $Room go $Direction to $Target) from the standard library, which is always queried with a known $Room.

The second principle is: Make tail calls. The last statement of a rule body is said to be in tail position. Making a query in tail position is cheaper than making it elsewhere. Thus, the following implementation:

(say it with $Obj)My hovercraft is full of (name $Obj).
(stash $Obj)(say it with $Obj) (now) ($Obj is #in #hovercraft)
[Copy to clipboard]

is not as efficient as this one:

(say it with $Obj)My hovercraft is full of (name $Obj).
(stash $Obj)(now) ($Obj is #in #hovercraft) (say it with $Obj)
[Copy to clipboard]

In the first version, the code for (stash $) needs to create an activation record, and set things up so that execution can resume at the (now) statement, with $Obj still bound to the correct value, after the inner query succeeds. In the second version, there is no need for an activation record: The (now) statement is handled locally, and then control is simply passed to (say it with $Obj).

If the final statement of a rule is a disjunction or an if-statement, then the last statement inside every branch is also in tail position.

It is especially fruitful to place recursive calls in tail position, as we can then avoid creating an activation record for every step in the recursion.

Memory footprint

Per-object variables have a relatively large memory footprint. If you would like your game to be playable on vintage hardware, try to minimize the number of per-object variables used by your design, especially if they will remain unset for most objects.

For instance, looking back at the (#troll wields #axe) example from the chapter on dynamic predicates, if the troll and the player are the only characters that wield weapons, it would be much better to use a pair of global variables:

(global variable (troll wields $))
(global variable (player wields $))
[Copy to clipboard]

If desired, wrapper predicates could be defined for querying or updating those variables as though they were per-object variables:

(#troll wields $Weapon)
(troll wields $Weapon)
(#player wields $Weapon)
(player wields $Weapon)
(now #troll wields $Weapon)
(now) (troll wields $Weapon)
(now #player wields $Weapon)
(now) (player wields $Weapon)
[Copy to clipboard]

To conserve heap memory, use backtracking whenever possible. Identify places in the code where you can implement loops with backtracking instead of recursion. For instance, to iterate over a list and print something for each element, use exhaust:

(exhaust) {
*($X is one of $List)
(report on $X)
}
[Copy to clipboard]

Tail-call optimization is backend-dependent. This means that infinite loops must be implemented using (repeat forever). If they are implemented using recursion, some memory might leak with every iteration, and the program will eventually crash.

Asymptotic complexity of per-object flags

Per-object flags are implemented in one of two ways. Most flags are stored in some array-like structure indexed by object number, which means that checking or updating the flag is a constant-time operation. Some flags are also tracked by a separate data structure on the side: A single-linked list of all objects for which the flag is currently set.

The linked list is enabled for any flag predicate that the program might query with an unbound parameter, as this allows Dialog to loop efficiently over every object for which the flag is set. An example from the standard library is ($ is to be considered), which is used during path finding. Such per-object flags can also be checked or set in constant time, but clearing them is linear in the number of objects having the flag set, because it is necessary to traverse the list at runtime in order to unlink the object in question.

The linked list is also enabled for flags that the program might clear for every object, e.g. (now) ~($ is to be considered). That operation is always linear in the number of objects that have the flag set.

Limitations and the future of Dialog

The support for numerical computation in Dialog is quite minimal at the moment. A future version of Dialog might include more built-in predicates, and an extended numerical range. However, this must be balanced against the stated design goal of a small, elegant language.

Dialog is designed to emit compact Z-code, and will be able to cram a substantial amount of content into the good old Z-machine. Nevertheless, there is an upper limit, and a backend for the Glulx platform should eventually be added to support larger stories.

Closures could be added to the language to support callbacks. There are a handful of places in the standard library where this would make the code substantially cleaner and easier to extend from the outside.

There is currently no way to provide a fallback if the interpreter fails to print a particular unicode character. The missing character will come out as a question mark instead, but this is decidedly ugly. A future version of the language may introduce functionality for dealing with unsupported characters.

The (uppercase) built-in predicate is currently limited to English letters (A–Z). It should at least be upgraded to support the so called default extra characters in the ZSCII character set, which cover the needs of several European languages. Support for case-conversion of arbitrary unicode characters is probably more trouble than it's worth.