[PEAK] Compiling lvalues, iterators, and comprehensions
Phillip J. Eby
pje at telecommunity.com
Mon Jul 28 12:59:18 EDT 2008
In order to implement the SQL mapping stuff described in my June 19th
post, we're going to need to be able to compile lvalues, iterators,
and full comprehensions. The peak.rules.ast_builder module supports
comprehension syntax, but peak.util.assembler and peak.rules.codegen
don't have any node types or bytecode generation for them as
yet. This is a bit of a problem, since in the default case of
compiling a query (i.e., executing it as-is) we'll need to be able to
implement the looping ourselves.
My current vision for how SQL translation will work is that there
will be a QueryBuilder that accepts only listcomps or genexps;
anything else as a top-level expression would be verboten. There
will also need to be an LValueBuilder for compiling the 'for' clauses.
Essentially, this is needed because the expressions used in 'for'
clauses are assignment targets ("lvalues") and can combine the use of
tuples, setitems, etc. For example:
    [... for c in qz for a.b[c], d in abcd if ...]
This is a syntactically valid (if rather baroque)
expression. Personally, I think that we don't actually need to
support this full generality for purposes of query translation. It
should be sufficient to support local variables and possibly-nested
sequences thereof. However, even if we don't support that syntax for
compilation, the LValueBuilder will still need to recognize the other
forms, if only to explicitly reject them.
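As a rough illustration of the "recognize, then reject" idea, here is
a minimal sketch that uses the standard library's ast module rather
than peak.rules.ast_builder. The function names classify_target and
for_targets are hypothetical and are not part of any PEAK API; the
sketch only shows accepting local names and possibly-nested sequences
of them, while explicitly refusing attribute and subscript targets.

```python
import ast


def classify_target(node):
    """Return a name, or a nested tuple of names, for a 'for' target.

    Anything other than a plain name or a (possibly nested) sequence
    of plain names is recognized but explicitly rejected.
    """
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, (ast.Tuple, ast.List)):
        return tuple(classify_target(elt) for elt in node.elts)
    raise SyntaxError(
        "unsupported assignment target: %s" % type(node).__name__
    )


def for_targets(source):
    """Parse a listcomp or genexp and classify each 'for' clause target."""
    tree = ast.parse(source, mode="eval").body
    if not isinstance(tree, (ast.ListComp, ast.GeneratorExp)):
        raise SyntaxError("only listcomps and genexps are accepted")
    return [classify_target(gen.target) for gen in tree.generators]
```

With this sketch, a query such as "[x for a, (b, c) in rows for d in b]"
yields the nested name structure for each clause, while the baroque
example above fails at the "a.b[c]" target.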
So, we'll need an UnpackSequence node type and a LocalAssign node
type. UnpackSequence will generate an UNPACK_SEQUENCE of the length
of its argument, and then compile the items in its argument. These
might be LocalAssign nodes, or nested UnpackSequence nodes. On the
SQL translation side, we'll detect LocalAssign nodes to assign type
information to local variables. (We'll really need to do the same
for UnpackSequence, because we might be looping over a nested query
returned by a method or function. But I might not bother with this
in the prototype version.)
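To make the intended shape concrete, here is a minimal sketch of the
two node types. It does not use the real BytecodeAssembler API: the
emit() method, the plain list of (opname, argument) tuples, and the
choice of STORE_FAST for local assignment are all assumptions made
for illustration.

```python
class LocalAssign:
    """Hypothetical node: store the top of the stack in a local variable."""

    def __init__(self, name):
        self.name = name

    def emit(self, out):
        # Assumes a fast local; a real implementation might need
        # other store opcodes depending on the variable's scope.
        out.append(("STORE_FAST", self.name))


class UnpackSequence:
    """Hypothetical node: unpack the top of the stack into its items."""

    def __init__(self, items):
        self.items = list(items)

    def emit(self, out):
        # UNPACK_SEQUENCE leaves the first element on top of the stack,
        # so the items are compiled in left-to-right order.
        out.append(("UNPACK_SEQUENCE", len(self.items)))
        for item in self.items:
            item.emit(out)


# The target "a, (b, c)" as a node tree:
target = UnpackSequence([
    LocalAssign("a"),
    UnpackSequence([LocalAssign("b"), LocalAssign("c")]),
])
instructions = []
target.emit(instructions)
```

Because each node only emits its own opcode and then delegates to its
children, nesting falls out of the recursion with no special casing.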
Compiling the loops themselves also needs to happen, to support
looping in Python. We'll need a new node type, something like
"For(iterable, assign, body)", that compiles to:
        iterable
        GET_ITER
    L1: FOR_ITER L2
        assign
        body
        JUMP_ABSOLUTE L1
    L2: ...
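The layout above can be sketched as a small function that assembles a
flat instruction list and patches in the two labels. This is only an
illustration: compile_for is a hypothetical name, instructions are
plain (opname, argument) tuples, and jump targets are list indexes
rather than real bytecode offsets.

```python
def compile_for(iterable, assign, body):
    """Lay out For(iterable, assign, body) as a list of (opname, arg).

    Each argument is a list of already-compiled instructions. The body
    is assumed to leave the stack as it found it.
    """
    code = list(iterable)
    code.append(("GET_ITER", None))
    loop_top = len(code)                    # L1: index of FOR_ITER
    code.append(("FOR_ITER", None))         # target patched below
    code.extend(assign)
    code.extend(body)
    code.append(("JUMP_ABSOLUTE", loop_top))
    loop_exit = len(code)                   # L2: first index after the loop
    code[loop_top] = ("FOR_ITER", loop_exit)
    return code


loop = compile_for(
    iterable=[("LOAD_FAST", "seq")],
    assign=[("STORE_FAST", "x")],
    body=[("LOAD_FAST", "x"), ("POP_TOP", None)],
)
```

The forward reference to L2 is the only wrinkle: its position isn't
known until the body has been laid out, so FOR_ITER is emitted with a
placeholder and fixed up afterwards.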
Currently, BytecodeAssembler doesn't correctly support the FOR_ITER
opcode, which has a branch-dependent stack effect. Specifically, it
expects the iterator on top of the stack, and either pushes the next
item (one more stack entry) or, if the iterator is exhausted, pops
the iterator (one fewer entry) and jumps to its target. The
modification probably won't be too difficult, though; mainly just an
extra "if" in the "Code.jump()" method. There will need to be some
tests, too, of course.
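To show why one stack effect per opcode isn't enough here, the
following sketch tracks stack depth along both paths out of each jump
and checks that every instruction is always reached at the same
depth. It is not BytecodeAssembler code: the effect tables follow the
description above rather than being read from an interpreter, jump
targets are list indexes, and stack_depths is a hypothetical helper.

```python
# Net stack change when execution falls through to the next instruction.
FALLTHROUGH = {
    "LOAD_FAST": +1,
    "GET_ITER": 0,
    "FOR_ITER": +1,      # next item pushed above the iterator
    "STORE_FAST": -1,
    "POP_TOP": -1,
}
# Net stack change when the jump is taken instead.
JUMP_TAKEN = {
    "FOR_ITER": -1,      # exhausted iterator popped
    "JUMP_ABSOLUTE": 0,
}


def stack_depths(code):
    """Map each instruction index (and the end index) to its entry depth."""
    depths = {}
    pending = [(0, 0)]
    while pending:
        index, depth = pending.pop()
        if index in depths:
            if depths[index] != depth:
                raise ValueError("inconsistent stack depth at %d" % index)
            continue
        depths[index] = depth
        if index == len(code):
            continue                      # fell off the end of the code
        op, arg = code[index]
        if op in JUMP_TAKEN:
            pending.append((arg, depth + JUMP_TAKEN[op]))
        if op in FALLTHROUGH:
            pending.append((index + 1, depth + FALLTHROUGH[op]))
    return depths


code = [
    ("LOAD_FAST", "seq"),    # 0
    ("GET_ITER", None),      # 1
    ("FOR_ITER", 7),         # 2  (L1)
    ("STORE_FAST", "x"),     # 3
    ("LOAD_FAST", "x"),      # 4
    ("POP_TOP", None),       # 5
    ("JUMP_ABSOLUTE", 2),    # 6
]                            # 7  (L2) is the end of the code
depths = stack_depths(code)
```

If FOR_ITER were given the same effect on both paths, either the loop
body or the loop exit would be reached at the wrong depth, which is
the kind of mismatch the extra "if" has to prevent.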
So, the next thing to implement should be to just add For(),
LocalAssign(), and UnpackSequence() node types to BytecodeAssembler
(with doc & tests, of course), and release a new version.