[PEAK] A quasi-interpretation code generation strategy for PEAK-Rules
Phillip J. Eby
pje at telecommunity.com
Sun Sep 3 01:04:50 EDT 2006
I was kicking around a bit with the PEAK-Rules code today (after being away
from it for almost two months), and came up with an idea for how to manage
code generation more effectively than was possible with my previous ideas.
My original thought was that code generation would be doing the equivalent
of creating individual "switch" statements representing dispatch tree
nodes. This could potentially mean generating quite a large body of code
to represent all possible dispatch locations, something like this:
    switch type(expr1):
        case X:
            switch type(expr2):
                ...
        case Y:
            switch type(expr2):
                ...
In contrast, RuleDispatch has a short dispatching loop that walks a tree of
dictionaries, along with some code to compute the next expression to be
tested. So instead of the dispatch tree being 100% code, it's 100% data.
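A toy version of that all-data approach, just to show the shape of it (the
node layout here is illustrative, not RuleDispatch's actual internals):

    def walk_tree(node, args):
        # interior nodes are (get_value, children) pairs; leaves are callables
        while isinstance(node, tuple):
            get_value, children = node
            node = children[get_value(args)]
        return node(*args)

    # e.g. dispatch on type(args[0]), then type(args[1]):
    tree = (lambda args: type(args[0]),
            {int: (lambda args: type(args[1]),
                   {str: lambda a, b: "int,str"})})
    result = walk_tree(tree, (1, "x"))   # -> "int,str"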
What I realized today is that it's possible to make a 50/50 split between
code and data. I could create a lazily-expanded dispatch tree using
dictionaries, just like in RuleDispatch. But the choice of which
expression to look up next could be embedded in code, something like this:
    node, dispatch_expr = start_node, start_expr
    while not node.isleaf:
        switch dispatch_expr:
            case 1:
                value = arg1
                lookup = by_type
            case 2:
                value = arg2
                lookup = by_type
            case 3:
                value = arg1 * arg2
                lookup = by_value
            ...
        node, dispatch_expr = lookup(node, value)
    return dispatch_expr(*args, **kw)
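Here's a runnable approximation of that loop, with if/elif standing in for
the switch. The node layout and the by_type/by_value helpers are assumptions
for illustration, not the eventual PEAK-Rules data structures:

    class Node(object):
        isleaf = False
        def __init__(self, cases):
            self.cases = cases          # key -> (next_node, next_expr)

    class Leaf(object):
        isleaf = True
    LEAF = Leaf()

    def by_type(node, value):
        return node.cases[type(value)]  # pick the branch by the value's type

    def by_value(node, value):
        return node.cases[value]        # pick the branch by equality

    def dispatch(start_node, start_expr, arg1, arg2):
        node, dispatch_expr = start_node, start_expr
        while not node.isleaf:
            if dispatch_expr == 1:
                value, lookup = arg1, by_type
            elif dispatch_expr == 2:
                value, lookup = arg2, by_type
            else:                       # dispatch_expr == 3
                value, lookup = arg1 * arg2, by_value
            node, dispatch_expr = lookup(node, value)
        # at a leaf, dispatch_expr holds the target method itself
        return dispatch_expr(arg1, arg2)

    # e.g. switch on type(arg1) first, then on the value of arg1 * arg2:
    method = lambda a, b: "ints whose product is 6"
    tree = Node({int: (Node({6: (LEAF, method)}), 3)})
    result = dispatch(tree, 1, 2, 3)    # -> "ints whose product is 6"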
So, there would be cases in the switch for all of the expressions that are
used in any of the rules. This would have all the speed benefits of using
generated bytecode to do expression lookups, but the code would be much
smaller and would only need to be regenerated when new rule expressions
were added; any other changes could in principle be made by just updating
the dispatch tree. In addition, lazy expansion would be possible, just
like in RuleDispatch.
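Lazy expansion could be as simple as building a node's entry for a key the
first time it's looked up; roughly like this, with build_case standing in
for whatever derives a subtree from the pending rules:

    def lazy_lookup(node, key):
        # expand the child for 'key' on first use, then reuse it
        try:
            return node.cases[key]
        except KeyError:
            entry = node.cases[key] = node.build_case(key)
            return entry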
All in all, this seems like it would be the shortest route to getting
predicate dispatch up and running in PEAK-Rules, as it requires only one
code-generation strategy, essentially creating a single specialized version
of the dispatch loop for each generic function.
In principle, this strategy should be able to match the performance of even
the type-tuple-caching system, at least with some tweaks to automatically
cache entries for not-quite-matched types. For some functions, this
strategy should even *exceed* the performance of type-tuple caching, since
some argument types may not be very selective compared to, say, equality
testing. So, once I have this approach up and running, I could do some
performance analysis and see whether I can get away with dropping the
type-tuple implementation entirely, simply bootstrapping the core with a
simplified version of this strategy.
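For reference, type-tuple caching in its simplest form just maps the tuple
of argument types straight to a target method, filling the cache in on
misses. A simplified sketch of the general technique (find_best_method is a
stand-in for the real rule-combination logic, not the actual PEAK-Rules
core):

    def make_dispatcher(find_best_method):
        cache = {}
        def dispatch(*args):
            types = tuple(map(type, args))
            try:
                method = cache[types]
            except KeyError:
                # cache miss: compute the best method for this type tuple
                method = cache[types] = find_best_method(types)
            return method(*args)
        return dispatch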
The principal disadvantage of this strategy is that it won't lend itself at
all well to, say, optimization via PyPy or Psyco. The computed gotos are
bad enough (as I understand it, they'll cause Psyco to crash at the
moment), but the use of an external data structure to "interpret" would
likely give the optimizers fits. So, such use cases will likely need an
alternative code-generation strategy, but that's what PEAK-Rules is
supposed to be good at anyway. (That is, allowing alternative
implementations of key features.)