
Jostraca Design Note: Units
Version: 1
Author: Richard Rodger
In version 0.3.1, templates are parsed in one pass, and the CodeWriter is created during
this pass. In effect, two independent tasks are carried out together in a dependent fashion.
For future flexibility, the creation of the CodeWriter should occur separately from the parsing
of the template. This allows the definition of a canonical CodeWriter structure composed
of CodeWriter Units, that is, structural blocks which have a set of properties that determine
how they are to be converted into the source code of a CodeWriter.
The properties which define a unit are:
type: text, script, expression ... more as needed
section: name of section
origin: source file and line numbers for debugging
... more as needed
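As a minimal sketch of the properties listed above, a unit could be held in a small immutable class. All names here (UnitSketch, its fields, the file name page.jtm) are hypothetical and chosen only for illustration; the content field is an assumption based on the definition of a unit as a block of text.

```java
/** Hypothetical sketch of a CodeWriter unit; illustrative names, not the Jostraca API. */
final class UnitSketch {
    final String type;          // "text", "script", "expression", ... (open-ended)
    final String section;       // name of the section the unit contributes to
    final String originFile;    // source file, kept for debugging
    final int originFirstLine;  // first source line of the unit
    final int originLastLine;   // last source line of the unit
    final String content;       // assumed: the block of template text the unit carries

    UnitSketch(String type, String section, String originFile,
               int originFirstLine, int originLastLine, String content) {
        this.type = type;
        this.section = section;
        this.originFile = originFile;
        this.originFirstLine = originFirstLine;
        this.originLastLine = originLastLine;
        this.content = content;
    }

    /** One-line summary, for example for a debugging dump of the unit list. */
    String describe() {
        return type + " -> " + section
            + " (" + originFile + ":" + originFirstLine + "-" + originLastLine + ")";
    }

    public static void main(String[] args) {
        UnitSketch u = new UnitSketch("text", "body", "page.jtm", 3, 4, "Hello");
        System.out.println(u.describe());
    }
}
```

Keeping the type as a plain string rather than an enumeration leaves room for the "more as needed" entries.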
The units are presented in a list which is processed once from start to finish. The units are
converted to source code and the resulting source code is appended to the contents of
each section. The sections are then inserted into the CodeWriter format as at present.
This has the advantage that the existing CodeWriter formats do not need to be
changed. The CodeWriter continues to access the Properties of the generation instance
as normal.
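The single pass over the unit list can be sketched as follows. This is only an illustration under stated assumptions: the Unit class, the section names, the toSource conversion and the _insert call it emits are placeholders, not the actual CodeWriter source conventions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SectionAssemblySketch {

    /** Minimal stand-in for a unit: type, target section, and text. */
    static final class Unit {
        final String type, section, content;
        Unit(String type, String section, String content) {
            this.type = type;
            this.section = section;
            this.content = content;
        }
    }

    /** Placeholder conversion of one unit to source code (assumed output shape). */
    static String toSource(Unit u) {
        if ("text".equals(u.type)) {
            return "_insert(\"" + u.content + "\");\n";
        }
        if ("expression".equals(u.type)) {
            return "_insert(" + u.content + ");\n";
        }
        return u.content + "\n"; // any other type, e.g. script: used verbatim
    }

    /** Processes the list once, appending each unit's source to its section. */
    static Map<String, String> assemble(List<Unit> units) {
        Map<String, StringBuilder> buffers = new LinkedHashMap<String, StringBuilder>();
        for (Unit u : units) {
            StringBuilder sb = buffers.get(u.section);
            if (sb == null) {
                sb = new StringBuilder();
                buffers.put(u.section, sb);
            }
            sb.append(toSource(u));
        }
        Map<String, String> sections = new LinkedHashMap<String, String>();
        for (Map.Entry<String, StringBuilder> e : buffers.entrySet()) {
            sections.put(e.getKey(), e.getValue().toString());
        }
        return sections;
    }

    public static void main(String[] args) {
        List<Unit> units = Arrays.asList(
            new Unit("script", "declare", "int count = 3;"),
            new Unit("text", "body", "Total: "),
            new Unit("expression", "body", "count"));
        System.out.println(assemble(units));
    }
}
```

The resulting map of section name to section contents is what would then be inserted into the existing CodeWriter format.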
The list of units is present in memory as an abstract data structure, but should also be
provided for external use via some interface: through an API or standard external format.
This is useful for componentisation and for debugging: the unit list reconstructed as a
template is in fact the canonical form of the template after directive processing (replaces,
includes, etc.).
The following terms are defined: a Unit is a block of text to be used for creating the
CodeWriter, under some predefined transformations. An Element is a structural part of a
template and maps to one or more units. Thus template parsing now consists of the
creation of an element tree, and CodeWriter creation consists of a transformation to the
list of units produced by the element tree.
The phrase "element tree" is used deliberately as the structure of template elements must
be capable of representing more complex forms than at present. At present there is no
concept of "scope" in a template, that is, a region where certain directives or other effects
are active. It will be useful to have the concept of scope: this provides better control of
included files and can be used to limit the effect of a replacement. Other uses for scope
are certainly possible as the concept is intended to be quite general.
A template is seen as a tree of elements, where contained scopes are sub-trees. It will be
necessary to examine this concept more fully in a separate design note. At the moment
the only important consideration is that there is an assumed mapping from element trees
to unit lists: the element tree is "flattened" into the list by a series of transformations, and,
possibly, certain element types map directly to certain unit types.
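The flattening of an element tree into a unit list can be sketched as a depth-first walk. This is a simplified illustration, not a proposed design: the Element class is hypothetical, units are reduced to "type:content" strings, and only the direct one-element-to-one-unit mapping is shown.

```java
import java.util.ArrayList;
import java.util.List;

final class FlattenSketch {

    /** Hypothetical element: a leaf carries text, a scope carries children. */
    static final class Element {
        final String kind;     // "scope" for a sub-tree, otherwise a unit type
        final String content;  // text of a leaf; unused for scopes
        final List<Element> children = new ArrayList<Element>();

        Element(String kind, String content) {
            this.kind = kind;
            this.content = content;
        }

        Element add(Element child) {
            children.add(child);
            return this;
        }
    }

    /** Depth-first walk: scopes contribute their children in order, leaves one unit each. */
    static void flatten(Element e, List<String> units) {
        if ("scope".equals(e.kind)) {
            for (Element child : e.children) {
                flatten(child, units);
            }
        } else {
            units.add(e.kind + ":" + e.content);
        }
    }

    static List<String> flatten(Element root) {
        List<String> units = new ArrayList<String>();
        flatten(root, units);
        return units;
    }

    public static void main(String[] args) {
        Element root = new Element("scope", "")
            .add(new Element("text", "a"))
            .add(new Element("scope", "")
                .add(new Element("script", "b")))
            .add(new Element("text", "c"));
        System.out.println(flatten(root));
    }
}
```

Scope-dependent effects, such as a replacement limited to one sub-tree, would be applied during the walk before the unit is emitted; that step is omitted here.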
Assuming a list of units, we can use the existing implementation of CodeWriter
construction much as it is. As an interim, evolutionary approach, we can develop the
definition of units and various interfaces to them, by using a naïve transformation of the
existing template parsing implementation, so that a list of "naïve" elements can be
transformed into a list of units. Later, the more developed concept of element trees can be
introduced.
The current concepts of element transformation must be rejected and replaced with the
concept of CodeWriter-creating transformations on units. Therefore the properties:
lang.StringEscapeTransform,
lang.TemplateTextTransforms,
lang.TemplateInsertTransforms,
lang.TemplateScriptTransforms,
lang.TemplateExpressionTransforms,
lang.TemplateDirectiveTransforms,
are now essentially meaningless. Instead, each unit type has its own list of transforms,
which render the appropriate CodeWriter source code.
The types of unit should be dynamically defined to avoid constraining extensibility. Thus
the following property structure is proposed:
lang.CodeWriterTransform.unit-type = <class-list-of-transforms>
where unit-type is one of "text", "script", "expression", etc., as needed. These Property
names are not predefined and are resolved dynamically at CodeWriter creation time. If no
transforms are defined, the textual content of the unit is used verbatim.
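The dynamic lookup could work roughly as sketched below. Several things here are assumptions made to keep the example self-contained: the Transform interface, the comma-separated value syntax, the short names "escape" and "insert" standing in for transform class names, and the _insert wrapper. A real implementation would presumably load the listed classes instead of using a fixed registry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

final class UnitTransformSketch {

    /** Hypothetical transform contract: source text in, source text out. */
    interface Transform {
        String apply(String text);
    }

    // Stand-in registry keyed by short name (assumption; the note proposes a class list).
    static final Map<String, Transform> REGISTRY = new HashMap<String, Transform>();
    static {
        // Escape backslashes first, then double quotes, for use inside a string literal.
        REGISTRY.put("escape", t -> t.replace("\\", "\\\\").replace("\"", "\\\""));
        // Wrap the text in a hypothetical insert method call.
        REGISTRY.put("insert", t -> "_insert(\"" + t + "\");");
    }

    /** Resolves the transform list for a unit type at creation time and applies it in order. */
    static String render(Properties props, String unitType, String content) {
        String names = props.getProperty("lang.CodeWriterTransform." + unitType);
        if (names == null || names.trim().isEmpty()) {
            return content; // no transforms defined: content used verbatim
        }
        String result = content;
        for (String name : names.split(",")) {
            Transform t = REGISTRY.get(name.trim());
            if (t == null) {
                throw new IllegalArgumentException("unknown transform: " + name.trim());
            }
            result = t.apply(result);
        }
        return result;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("lang.CodeWriterTransform.text", "escape, insert");
        System.out.println(render(props, "text", "say \"hi\""));
        System.out.println(render(props, "script", "int x = 1;"));
    }
}
```

Because the property name is built from the unit type at lookup time, a new unit type needs only a new property, not a code change.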
The types of transform to be used are the standard ones already known: string escapes,
insert method wrappers, etc. But also envisaged are new transforms to encode the unit
source details as comments in the CodeWriter source, or transforms to provide (as yet
undetermined) debugging or logging code.
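One of the envisaged transforms, encoding the unit's source details as a comment, might look like the following sketch. It assumes the CodeWriter language uses "//" line comments; the class name, method name and comment layout are hypothetical.

```java
final class OriginCommentSketch {

    /** Prefixes generated source with a comment naming where the unit came from. */
    static String withOriginComment(String source, String file, int firstLine, int lastLine) {
        // Assumption: "//" starts a line comment in the CodeWriter language.
        return "// origin: " + file + ":" + firstLine + "-" + lastLine + "\n" + source;
    }

    public static void main(String[] args) {
        System.out.println(withOriginComment("_insert(count);", "page.jtm", 12, 12));
    }
}
```

A CodeWriter language with a different comment syntax would need its own variant of this transform.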
The concept of the Generation Instance Properties is now more important: all CodeWriter
creation and execution processes should have access to the same global set of
properties, determined at generation time by the standard property definition hierarchy.
This global, flat set of properties should be independent from other types of property such
as directive or template attributes which have been scoped to certain elements. The
generation instance properties are not seen as useful for semantic control of the
generated code, rather for operational control of the generation process.