[TransWarp] Page Templates
Phillip J. Eby
pje at telecommunity.com
Fri Jun 20 15:54:05 EDT 2003
(Yet another rambling design brief...)
The basic architecture:
* XHTML input (should we integrate w/HTML Tidy? Is there a Python lib for
that?)
* Parse w/SOX to object tree
* Visitor+Builder pattern: recursive function walks tree and calls
functions on builder to emit code
* Generated code will assume it has some kind of input of a 'model' to
start with, and a 'write()' function to emit text.
* Builder object has method to emit 'write()' calls for constant strings;
should be written so that consecutive constants are concatenated by Python
into a single constant in an invocation of write().
* Builder has methods to "push" and "pop" model paths. So code generator
says builder.pushModel('foo') when it encounters 'model="foo"' in a tag,
then builder.popModel() on exit from the tag's nested contents.
* Code generator carries view namespace, and maintains view
stack. Referenced view objects are looked up and adapted to a code
generation interface, then get the current XHTML node object and builder
passed in. The code generation interface could be shared by the main code
generator, since it takes the same inputs (a node, a view namespace stack,
and a builder).
* Tags should probably provide convenience methods to find "pattern"
sub-nodes.
* The builder should provide the ability to start and end a nested function
definition, in case one wants to turn a "pattern" into a callable function.
* The builder needs to provide namespace management for parameters and
temporaries.
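A minimal sketch of the builder bullets above, for illustration only. The class name SketchBuilder and the methods constant(), statement(), currentModel() and getSource() are assumptions; only pushModel() and popModel() are named in this brief. The sketch also joins adjacent constants at generation time instead of relying on Python to concatenate adjacent literals, which is a simplification.

```python
class SketchBuilder:
    """Hypothetical builder that accumulates generated Python source lines."""

    def __init__(self, rootModel='model'):
        self.lines = []             # finished lines of generated code
        self.pending = []           # constant strings awaiting a write() call
        self.models = [rootModel]   # stack of expressions naming the current model

    def constant(self, text):
        # Defer output so that adjacent constants share a single write() call.
        self.pending.append(text)

    def statement(self, code):
        # Any non-constant statement first flushes the buffered constants.
        self._flush()
        self.lines.append(code)

    def _flush(self):
        if self.pending:
            self.lines.append('write(%r)' % ''.join(self.pending))
            self.pending = []

    def pushModel(self, name):
        # Entering a tag with model="name": extend the current model path.
        self.models.append('%s[%r]' % (self.models[-1], name))

    def popModel(self):
        # Leaving the tag's nested contents: restore the previous path.
        self.models.pop()

    def currentModel(self):
        return self.models[-1]

    def getSource(self):
        self._flush()
        return '\n'.join(self.lines)


b = SketchBuilder()
b.constant('<p>')
b.constant('Hello, ')          # merged with the previous constant
b.pushModel('user')
b.statement('write(quote(%s.getValue()))' % b.currentModel())
b.popModel()
b.constant('</p>')
source = b.getSource()
print(source)
```

Here the two leading constants end up in one write() call, and the model path is restored once popModel() runs.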
This is beginning to sound as though I should create Module, Class, and
Function builders, that work atop an IndentedStream for output. Each would
maintain a 'symbol' table of currently active variable names, and perhaps
have the ability to look up names from other scopes. Classes would always
consider their containing Module to be their next namespace level, and
Functions would consider their next-outer Function or Module to be their
parent namespace. I suppose we don't actually need Class or Module for
page templates, but they'd be handy for other PEAK code-generating efforts.
The page template code builder would be based on Function, adding all the
convenience methods for outputting strings to write, pushing/popping model
items, etc.
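One possible shape for the scope builders described above, as a rough sketch. IndentedStream, Module, Class and Function are named in this brief, but every method and attribute below (writeln, declare, lookup, names) is an assumption made for the example.

```python
import io


class IndentedStream:
    """Hypothetical output stream that prefixes lines with the current indent."""

    def __init__(self):
        self.out = io.StringIO()
        self.level = 0

    def writeln(self, text):
        self.out.write('    ' * self.level + text + '\n')

    def getvalue(self):
        return self.out.getvalue()


class Scope:
    """Base for all builders: a symbol table plus a link to the parent scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = set()      # variable names active in this scope

    def declare(self, name):
        self.names.add(name)

    def lookup(self, name):
        # Return the nearest scope that defines 'name', or None if unknown.
        scope = self
        while scope is not None:
            if name in scope.names:
                return scope
            scope = scope.parent
        return None


class Module(Scope):
    pass


class Class(Scope):
    def __init__(self, module):
        # A class always treats its containing module as the next level.
        super().__init__(module)


class Function(Scope):
    def __init__(self, parent):
        # A function's parent is the next-outer function or the module.
        if not isinstance(parent, (Function, Module)):
            raise TypeError('parent must be a Function or Module')
        super().__init__(parent)


mod = Module()
mod.declare('quote')
fn = Function(mod)
fn.declare('write')
inner = Function(fn)

stream = IndentedStream()
stream.writeln('def render(model, write):')
stream.level += 1
stream.writeln("write('<ul>')")
```

A page template builder could then subclass Function and add the string and model convenience methods on top.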
We need "text" and "list" views to be able to do anything to start
with. That gives us the equivalent of early DTML's "var" and
"in". Everything else will appear later. It needs to be "dirt simple" to
write a code generator for a view, so we need to give the builders some
very high-level methods to manage code generation, beyond just managing
string constants. Ideally, something like this:
<ul model="toc" view="list">
    <li pattern="listItem"><span model="title" view="text">Item title goes
        here</span>
        <ul model="children" view="list">
            <li pattern="listItem">
                <span model="title" view="text">Subtitle goes here</span>
            </li>
        </ul>
    </li>
</ul>
Would translate as something like:
write('<ul>')
for tmpVar1 in model['toc']:
    write('<li>')
    write(quote(tmpVar1['title'].getValue()))
    write('<ul>')
    for tmpVar2 in tmpVar1['children']:
        write('<li>')
        write(quote(tmpVar2['title'].getValue()))
        write('</li>')
    del tmpVar2
    write('</ul></li>')
del tmpVar1
write('</ul>')
The 'write' function, of course, will probably be some list's "append"
method, rather than anything written in Python. And all the string
constants above would be longer, since they'd contain the original whitespace.
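To make the "write is a list's append" idea concrete, here is a runnable sketch of the translated loop. The Model class is a hypothetical stand-in for whatever the model-side protocol turns out to be, and html.escape stands in for the framework's quote(). The 'del' statements are left out here, because deleting a loop variable fails when the loop body never ran.

```python
from html import escape as quote   # stand-in for the framework's quote()


class Model:
    """Hypothetical model wrapper: subscripting and iteration yield Models."""

    def __init__(self, value):
        self.value = value

    def __getitem__(self, key):
        return Model(self.value[key])

    def __iter__(self):
        return (Model(item) for item in self.value)

    def getValue(self):
        return self.value


def render(model, write):
    # Same shape as the generated code, with whitespace constants omitted.
    write('<ul>')
    for tmpVar1 in model['toc']:
        write('<li>')
        write(quote(tmpVar1['title'].getValue()))
        write('<ul>')
        for tmpVar2 in tmpVar1['children']:
            write('<li>')
            write(quote(tmpVar2['title'].getValue()))
            write('</li>')
        write('</ul></li>')
    write('</ul>')


data = {'toc': [{'title': 'A & B', 'children': [{'title': 'A.1'}]}]}
output = []
render(Model(data), output.append)   # 'write' is just list.append
page = ''.join(output)
print(page)
```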
Doing this example also shows that "text" may need to be smart about
removing useless '<span>' tags, or else we need "text" and "replace"
variants. Heck, maybe we also need a "structure" variant that doesn't
HTML-quote the content that it puts in.
It also illustrates why namespace management by the code builder would be
critical to easy code generation. I suppose we could use name prefixes to
distinguish generated variables (tmpVar1) from framework variables (write,
quote). But that seems to lead to severe ugliness in short order, like
Pyrex's '__pyx_v1' names.
To work around this, it seems useful to conceive of 'Symbol' objects that
one can create and register with a scope. For example, a Symbol object
would be created to represent the 'write' function. Anybody generating
code that needs to issue 'write', would need to import that Symbol from
somewhere, and then use it with the code builder. The Symbol would know
how to generate code to initialize itself, and it would know how it
preferred to be named. But the namespace would know how it *actually*
would be named in that space, or how it was named in an enclosing,
accessible space.
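A small sketch of how Symbol objects and namespaces might divide the work. All names here (Symbol, Namespace, preferredName, initCode, nameOf, register, initLine) are assumptions for illustration; the brief only fixes the idea that a symbol knows its preferred name and the namespace knows the actual one.

```python
class Symbol:
    """Hypothetical symbol: knows its preferred name and how to initialize itself."""

    def __init__(self, preferredName, initCode=None):
        self.preferredName = preferredName
        self.initCode = initCode    # template using %(name)s, or None

    def initLine(self, name):
        # Generate the initialization code under the name actually assigned.
        return self.initCode % {'name': name}


class Namespace:
    """Hypothetical scope: maps each Symbol to the name it really received."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = {}             # Symbol -> actual name in this scope

    def _visibleNames(self):
        used, ns = set(), self
        while ns is not None:
            used.update(ns.names.values())
            ns = ns.parent
        return used

    def register(self, symbol):
        # Honor the preferred name unless it collides with a visible name.
        used = self._visibleNames()
        name, n = symbol.preferredName, 1
        while name in used:
            n += 1
            name = '%s%d' % (symbol.preferredName, n)
        self.names[symbol] = name
        return name

    def nameOf(self, symbol):
        # Reuse the name from this or an enclosing scope, else register here.
        ns = self
        while ns is not None:
            if symbol in ns.names:
                return ns.names[symbol]
            ns = ns.parent
        return self.register(symbol)


WRITE = Symbol('write', '%(name)s = output.append')
outer = Namespace()
inner = Namespace(outer)

outerWrite = outer.nameOf(WRITE)        # registered in the outer scope
innerWrite = inner.nameOf(WRITE)        # found in the enclosing scope
tmpA = inner.nameOf(Symbol('tmp'))
tmpB = inner.nameOf(Symbol('tmp'))      # same preference, different symbol
print(WRITE.initLine(outerWrite), tmpA, tmpB)
```

With this arrangement, code generators never spell a variable name themselves, so prefixes like 'tmpVar' or '__pyx_v' are not needed to avoid collisions.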
I'm not sure how to handle shadowing in nested namespaces, however. A
nested function with a 'write' parameter cannot access the 'write' of its
enclosing function, for example. However, this probably isn't worth worrying about,
because the write parameter probably wants to be a fresh symbol in each
function anyway. Usage-specific scope builders will carry these symbols as
outward attributes. Thus, if I create a new 'PageTemplateFunction' scope
object, it will have attributes representing its parameter symbols.
Yes, that would work. And "above" the Module namespace should be a
Builtins namespace that contains Python built-in symbols. This means that the code
generator could be prohibited from shadowing builtins at any nesting level,
which is probably not a bad idea. Imagine trying to generate code and not
being able to access builtins like len()!
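A rough sketch of a Builtins namespace sitting above everything else, assuming a simple Scope class with a declare() method (both hypothetical). The set of built-in names is taken from Python's standard 'builtins' module.

```python
import builtins


class Scope:
    """Hypothetical scope that refuses names defined by a Builtins ancestor."""

    def __init__(self, parent=None):
        self.parent = parent
        self.names = set()

    def declare(self, name):
        # Walk outward; reject the name if a Builtins scope already owns it.
        scope = self.parent
        while scope is not None:
            if isinstance(scope, Builtins) and name in scope.names:
                raise ValueError('%r would shadow a builtin' % name)
            scope = scope.parent
        self.names.add(name)
        return name


class Builtins(Scope):
    """Top-level scope holding the names of Python's built-in symbols."""

    def __init__(self):
        super().__init__(None)
        self.names = set(dir(builtins))


module = Scope(Builtins())
function = Scope(module)
nested = Scope(function)

nested.declare('tmpVar1')           # fine: not a builtin
try:
    nested.declare('len')           # rejected at any nesting level
    shadowed = True
except ValueError:
    shadowed = False
print(shadowed)
```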
Struggling beneath the surface of these ideas, trying to get out, is a way
to write Python in Python. That is, a syntax for generating Python code
that can be cleanly expressed in Python itself, such that you could
mechanically translate from a snippet of Python to a snippet of "meta
Python". Unfortunately, every syntax I start to invent seems hopelessly
awkward or LISP-ish. Perhaps I'll think of something better later.
In the meantime, the concepts worked out so far sound doable for
implementing the basics. As we create more advanced views like form
widgets, we'll no doubt see what other issues arise. Actually, it seems
like it would be a good idea to look at some form widgets *first*, and see
what we find out. For example, one idea that pops to mind is that just
doing a simple dropdown selection widget seems to require *two* models: one
for the field value being rendered, and one for the collection of possible
values. The latter is metadata relative to the former. How should we
handle that?
Actually, the first answer that comes to mind is that it's an issue for the
model-side protocol. If models always return other models, and models have
metadata interfaces, then it's actually pretty simple to do that and many
other "widgety" things. Reserving parts of the model namespace for things
like menus and navigation seems feasible as well.
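As an illustration of handling the dropdown on the model side, the sketch below gives a field model a reserved 'choices' entry that is itself a collection of models. FieldModel, the 'choices' key and renderSelect() are all hypothetical names, and html.escape again stands in for quote().

```python
from html import escape as quote   # stand-in for the framework's quote()


class FieldModel:
    """Hypothetical model: a value plus metadata exposed as further models."""

    def __init__(self, value, metadata=None):
        self.value = value
        self.metadata = metadata or {}

    def getValue(self):
        return self.value

    def __getitem__(self, key):
        # Reserved keys such as 'choices' return models, not raw data.
        return self.metadata[key]


def renderSelect(model, write):
    # One model supplies both the current value and its possible values.
    current = model.getValue()
    write('<select>')
    for choice in model['choices']:
        value = choice.getValue()
        selected = ' selected="selected"' if value == current else ''
        write('<option%s>%s</option>' % (selected, quote(value)))
    write('</select>')


color = FieldModel(
    'red',
    {'choices': [FieldModel(name) for name in ('red', 'green')]},
)
output = []
renderSelect(color, output.append)
widget = ''.join(output)
print(widget)
```

The view only ever sees one model, so the template markup stays the same as for "text" and "list".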