[PEAK] Partitioning strategies for generic functions
Phillip J. Eby
pje at telecommunity.com
Sat Jul 15 22:29:05 EDT 2006
It occurs to me today that there might be a way to make the code generation
for generic functions in PEAK-Rules be more flexible, without creating a
need to do lots of complex customization.
Currently, RuleDispatch uses a single generation strategy: creating
lazily-expanded, interpreted dispatch nodes. PEAK-Rules has two planned
basic strategies: type-tuple caching and fully-expanded bytecode trees
(although only type-tuple caching is currently implemented).
But these are far from the only possibilities. I've been thinking, for
example, that it would be handy to be able to do a type-tuple caching
lookup that results in specialized dispatch sub-functions. Or a
tuple-cache system that uses other expressions besides argument types. And
so on.
Until now, these ideas have been somewhat of a pipe-dream, because I
couldn't see any way to unify them in an extensible fashion. They seemed
to require some degree of hard-coding to work. But now I see that it's
possible to create a kind of "chain of strategies" that could be extended
simply by creating new strategy objects.
The basic idea is that a generic function consists of a dispatch tree that
recursively partitions the function's methods in order to eliminate all
inapplicable methods using the smallest possible number of tests. And
different approaches to partitioning have different performance
characteristics.
For example, a raw type-tuple caching approach is extremely efficient for
functions that are called with a non-uniform distribution of types, because
as soon as all possible type combinations have been seen, dispatching is
just creating a tuple and looking it up in a dictionary, then invoking the
result. However, inline generation of if-then bytecode can beat the heck
out of type-tuple caching if the function is relatively simple and does
more value checking than type checking, because it doesn't spend time
creating a tuple or invoking another layer of functions.
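To make the first trade-off concrete, here's a minimal sketch of the type-tuple caching idea (purely illustrative, not the PEAK-Rules API; `find_method` and the dispatcher name are hypothetical). Once every type combination has been seen, each call is just building a tuple and doing one dict lookup:

```python
def make_type_tuple_dispatcher(find_method):
    """find_method(types) -> callable; results are cached per type tuple."""
    cache = {}
    def dispatch(*args):
        key = tuple(type(a) for a in args)
        try:
            method = cache[key]        # fast path: combination seen before
        except KeyError:
            method = cache[key] = find_method(key)
        return method(*args)
    return dispatch
```

The per-call cost here (tuple construction plus an extra function invocation) is exactly what inline if-then bytecode avoids for simple, value-heavy functions.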
So, the best possible performance for a generic function can be achieved by
making an appropriate choice of partitioning strategies, and this can
potentially be expressed as a strategy *chain*, where each strategy
determines how code is generated at a particular level. For example, let's
say we have strategy types LookupExpr(), IfThenTree(), InterpretedTree(),
and CombineMethods(). The simple PEAK-Rules implementation that currently
exists could be described (in super-simplified form) as something like:
LookupExpr("type(arg1),type(arg2),...", CombineMethods(), cache=True)
That is, the code generation that PEAK-Rules does now is to compute an
expression based on the types of the arguments, and look that tuple up in a
dictionary to get a method-combined function. If the tuple isn't found,
it's computed and then cached.
Good so far. But what if the generic function is something like an
operation to get a peak.web "view" of an object? One of the arguments is a
name. So we could have a more sophisticated lookup:
LookupExpr("type(arg1),type(arg2),name", CombineMethods(), cache=True)
So here we look up a tuple that includes the name. So far so good. Except
that it's a web app, so we've just created the opportunity for some jerk to
keep hitting randomly generated names so we'll use up all our memory
caching. :( So we need to do it a bit differently:
LookupExpr(
    "type(arg1),type(arg2)",
    LookupExpr("name", CombineMethods()),
    cache=True
)
Now, we only cache the type tuples, and use a nested dictionary for the
names, but we don't grow the cache when a lookup misses; we just take a
slower path in that case.
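A rough sketch of what that nested arrangement does at runtime (again hypothetical names, not the actual PEAK-Rules machinery): the outer table is cached per type tuple, while the inner name lookup never grows on a miss, so randomly generated names can't exhaust memory:

```python
def make_view_dispatcher(find_views, slow_path):
    """find_views(type_key) -> {name: callable}; cached per type tuple."""
    type_cache = {}
    def dispatch(obj, ctx, name):
        key = (type(obj), type(ctx))
        try:
            views = type_cache[key]
        except KeyError:
            views = type_cache[key] = find_views(key)
        method = views.get(name)
        if method is None:
            # unknown name: take the slower path, but no cache ever grows
            return slow_path(obj, ctx, name)
        return method(obj, ctx, name)
    return dispatch
```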
Each strategy has to embody a few different behaviors. It needs to be able
to generate what we might call "lookup code" and "launch code". The lookup
code is the code that looks something up when executing the strategy, and
the launch code is the code that's used to *invoke* that strategy from a
parent strategy. So CombineMethods() would have to know how to generate
code that calls the combined method, and LookupExpr would have to know how
to generate both kinds of code.
In the "launch" case, the LookupExpr simply assumes that the lookup table
is on the stack, and it generates code to compute the lookup expression and
perform the lookup. In the "lookup" case, the LookupExpr has to also
generate the code to put the lookup table on the stack in the first place.
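To illustrate the launch/lookup distinction, here's a toy version of the strategy interface. Real PEAK-Rules would emit bytecode; this sketch (my own invention, with made-up class shapes) emits Python source strings instead, where "the table is on the stack" becomes "the parent hands us an expression that evaluates to our table":

```python
class LookupExpr:
    """Strategy that indexes a table with the value of an expression."""
    def __init__(self, expr, next_strategy):
        self.expr, self.next = expr, next_strategy

    def launch_code(self, table_expr):
        # "launch": a parent already left our table "on the stack"
        # (table_expr); index it and hand off to the next strategy
        return self.next.launch_code("%s[(%s)]" % (table_expr, self.expr))

    def lookup_code(self, table_name):
        # "lookup": we must also emit the code that loads the table itself
        return self.launch_code(table_name)

class CombineMethods:
    def launch_code(self, table_expr):
        # the table entry is already a combined method; just call it
        return "%s(*args, **kw)" % table_expr
```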
The importance of this overall idea is not so much that it can be used to
customize the indexing approach of individual functions, as that it allows
code generation to be more thoroughly decoupled from the base "engine"
being used. My original conception of PEAK-Rules was that it would use
engine types to implement these types of implementation distinctions. But
now, it seems to me that just using a composable code generation system
might allow the system to use just one "engine" type, whose job is just to
co-ordinate the indexes required by the code generation strategy, and
manage things like automatically regenerating code when the function's
rules change.
And the most important thing about *that* is that it gives me a much better
sense of when the core framework is truly complete. When the code
generation is completely pluggable and composable, and I have the primitive
code-generation strategies available, then I can consider the core
framework to be finished. From then on out, everything else is just
optimizations or heuristics to allow better strategy chains to be selected
automatically. It'll still be possible and sensible to manually define
strategy chains (or even implement custom generation strategies) in some
cases, but it won't be *necessary*.
Another cool side-effect is that the core's bootstrapping process may
become simpler. The core generic function implementation needs to be able
to "use itself", which leads to some hairy bits in the current PEAK-Rules
implementation of type-tuple caching. If you were to add methods to a
generic function that's used to implement adding methods to generic
functions, for example, it would still need to be able to "use itself".
So, a benefit of the pluggable code-generation would be that we could
initially generate very simple code for these functions. And then, later,
we can upgrade the code generation strategy to do more when it's
needed. With a few special tricks to protect against re-entry, this
problem can thus be solved *once*, rather than needing new hacks for each
"meta-generic function" (generic function that implements generic functions).
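One such trick might look like this sketch (hypothetical, not PEAK-Rules code): while a generic function is busy regenerating its own dispatch code, re-entrant calls fall back to a dumb slow path instead of recursing into the code generator:

```python
def reentry_safe(regenerate, slow_dispatch):
    """Wrap a function so regeneration can safely 'use itself'."""
    state = {"busy": False, "fast": None}
    def dispatch(*args):
        fast = state["fast"]
        if fast is not None:
            return fast(*args)            # normal fast path
        if state["busy"]:
            return slow_dispatch(*args)   # re-entered: use the dumb path
        state["busy"] = True
        try:
            state["fast"] = regenerate()  # may call dispatch recursively
        finally:
            state["busy"] = False
        return state["fast"](*args)
    return dispatch
```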
Thus, it seems like I should refactor the core type-tuple implementation to
use this approach, at least once enough of the expression support library
(ast_builder and friends) is available to make it practical.