[PEAK] A quasi-interpretation code generation strategy for PEAK-Rules
Phillip J. Eby
pje at telecommunity.com
Sun Sep 3 01:04:50 EDT 2006
I was kicking around a bit with the PEAK-Rules code today (after being away
from it for almost two months), and came up with an idea for how to manage
code generation more effectively than was possible with my previous ideas.
My original thought was that code generation would be doing the equivalent
of creating individual "switch" statements representing dispatch tree
nodes. This could potentially mean generating quite a large body of code
to represent all possible dispatch locations, something like this:
    switch type(expr1):
        case X:
            switch type(expr2):
                ...
        case Y:
            switch type(expr2):
                ...
In contrast, RuleDispatch has a short dispatching loop that walks a tree of
dictionaries, along with some code to compute the next expression to be
tested. So instead of the dispatch tree being 100% code, it's 100% data.
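A toy version of that all-data approach, just to show the shape of it (the
node layout here is illustrative, not RuleDispatch's actual internals):

    def walk_tree(node, args):
        # interior nodes are (get_value, children) pairs; leaves are callables
        while isinstance(node, tuple):
            get_value, children = node
            node = children[get_value(args)]
        return node(*args)

    # e.g. dispatch on type(args[0]), then type(args[1]):
    tree = (lambda args: type(args[0]),
            {int: (lambda args: type(args[1]),
                   {str: lambda a, b: "int,str"})})
    result = walk_tree(tree, (1, "x"))   # -> "int,str"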
What I realized today is that it's possible to make a 50/50 split between
code and data. I could create a lazily-expanded dispatch tree using
dictionaries, just like in RuleDispatch. But the choice of which
expression to look up next could be embedded in code, something like this:
    node, dispatch_expr = start_node, start_expr
    while not node.isleaf:
        switch dispatch_expr:
            case 1:
                value = arg1
                lookup = by_type
            case 2:
                value = arg2
                lookup = by_type
            case 3:
                value = arg1 * arg2
                lookup = by_value
            ...
        node, dispatch_expr = lookup(node, value)
    return dispatch_expr(*args, **kw)
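Here's a runnable approximation of that loop, with if/elif standing in for
the switch. The node layout and the by_type/by_value helpers are assumptions
for illustration, not the eventual PEAK-Rules data structures:

    class Node(object):
        isleaf = False
        def __init__(self, cases):
            self.cases = cases          # key -> (next_node, next_expr)

    class Leaf(object):
        isleaf = True
    LEAF = Leaf()

    def by_type(node, value):
        return node.cases[type(value)]  # pick the branch by the value's type

    def by_value(node, value):
        return node.cases[value]        # pick the branch by equality

    def dispatch(start_node, start_expr, arg1, arg2):
        node, dispatch_expr = start_node, start_expr
        while not node.isleaf:
            if dispatch_expr == 1:
                value, lookup = arg1, by_type
            elif dispatch_expr == 2:
                value, lookup = arg2, by_type
            else:                       # dispatch_expr == 3
                value, lookup = arg1 * arg2, by_value
            node, dispatch_expr = lookup(node, value)
        # at a leaf, dispatch_expr holds the target method itself
        return dispatch_expr(arg1, arg2)

    # e.g. switch on type(arg1) first, then on the value of arg1 * arg2:
    method = lambda a, b: "ints whose product is 6"
    tree = Node({int: (Node({6: (LEAF, method)}), 3)})
    result = dispatch(tree, 1, 2, 3)    # -> "ints whose product is 6"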
So, there would be cases in the switch for all of the expressions that are
used in any of the rules. This would have all the speed benefits of using
generated bytecode to do expression lookups, but the code would be much
smaller and would only need to be regenerated when new rule expressions
were added; any other changes could in principle be made by just updating
the dispatch tree. In addition, lazy expansion would be possible, just
like in RuleDispatch.
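Lazy expansion could be as simple as building a node's entry for a key the
first time it's looked up; roughly like this, with build_case standing in
for whatever derives a subtree from the pending rules:

    def lazy_lookup(node, key):
        # expand the child for 'key' on first use, then reuse it
        try:
            return node.cases[key]
        except KeyError:
            entry = node.cases[key] = node.build_case(key)
            return entry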
All in all, this seems like it would be the shortest route to getting
predicate dispatch up and running in PEAK-Rules, as it requires only one
code-generation strategy, essentially creating a single specialized version
of the dispatch loop for each generic function.
In principle, this strategy should be able to match the performance of even
the type-tuple-caching system, at least with some tweaks to automatically
cache entries for not-quite-matched types. For some functions, this
strategy should even *exceed* the performance of type-tuple caching, since
some argument types may not be very selective compared to, say, equality
testing. So, once I have this approach up and running, I could do some
performance analysis and see whether I can get away with dropping the
type-tuple implementation entirely, simply bootstrapping the core with a
simplified version of this strategy.
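For reference, type-tuple caching in its simplest form just maps the tuple
of argument types straight to a target method, filling the cache in on
misses. A simplified sketch of the general technique (find_best_method is a
stand-in for the real rule-combination logic, not the actual PEAK-Rules
core):

    def make_dispatcher(find_best_method):
        cache = {}
        def dispatch(*args):
            types = tuple(map(type, args))
            try:
                method = cache[types]
            except KeyError:
                # cache miss: compute the best method for this type tuple
                method = cache[types] = find_best_method(types)
            return method(*args)
        return dispatch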
The principal disadvantage of this strategy is that it won't lend itself at
all well to, say, optimization via PyPy or Psyco. The computed gotos are
bad enough (as I understand it, they'll cause Psyco to crash at the
moment), but the use of an external data structure to "interpret" would
likely give the optimizers fits. So, such use cases will likely need an
alternative code-generation strategy, but that's what PEAK-Rules is
supposed to be good at anyway. (That is, allowing alternative
implementations of key features.)