[PEAK] Partitioning strategies for generic functions
Phillip J. Eby
pje at telecommunity.com
Sat Jul 15 22:29:05 EDT 2006
It occurs to me today that there might be a way to make the code generation
for generic functions in PEAK-Rules be more flexible, without creating a
need to do lots of complex customization.
Currently, RuleDispatch uses a single generation strategy: creating
lazily-expanded, interpreted dispatch nodes. PEAK-Rules has two planned
basic strategies: type-tuple caching and fully-expanded bytecode trees
(although only type-tuple caching is currently implemented).
But these are far from the only possibilities. I've been thinking, for
example, that it would be handy to be able to do a type-tuple caching
lookup that results in specialized dispatch sub-functions. Or a
tuple-cache system that uses other expressions besides argument types. And
so on.
Until now, these ideas have been somewhat of a pipe-dream, because I
couldn't see any way to unify them in an extensible fashion. They seemed
to require some degree of hard-coding to work. But now I see that it's
possible to create a kind of "chain of strategies" that could be extended
simply by creating new strategy objects.
The basic idea is that a generic function consists of a dispatch tree that
recursively partitions the function's methods in order to eliminate all
inapplicable methods using the smallest possible number of tests. And
different approaches to partitioning have different performance
characteristics.
For example, a raw type-tuple caching approach is extremely efficient for
functions that are called with a non-uniform distribution of types, because
as soon as all possible type combinations have been seen, dispatching is
just creating a tuple and looking it up in a dictionary, then invoking the
result. However, inline generation of if-then bytecode can beat the heck
out of type-tuple caching if the function is relatively simple and does
more value checking than type checking, because it doesn't spend time
creating a tuple or invoking another layer of functions.
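To make the first trade-off concrete, here's a minimal sketch of the type-tuple caching idea (purely illustrative, not the PEAK-Rules API; `find_method` and the dispatcher name are hypothetical). Once every type combination has been seen, each call is just building a tuple and doing one dict lookup:

```python
def make_type_tuple_dispatcher(find_method):
    """find_method(types) -> callable; results are cached per type tuple."""
    cache = {}
    def dispatch(*args):
        key = tuple(type(a) for a in args)
        try:
            method = cache[key]        # fast path: combination seen before
        except KeyError:
            method = cache[key] = find_method(key)
        return method(*args)
    return dispatch
```

The per-call cost here (tuple construction plus an extra function invocation) is exactly what inline if-then bytecode avoids for simple, value-heavy functions.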
So, the best possible performance for a generic function can be achieved by
making an appropriate choice of partitioning strategies, and this can
potentially be expressed as a strategy *chain*, where each strategy
determines how code is generated at a particular level. For example, let's
say we have strategy types LookupExpr(), IfThenTree(), InterpretedTree(),
and CombineMethods(). The simple PEAK-Rules implementation that currently
exists could be described (in super-simplified form) as something like:
LookupExpr("type(arg1),type(arg2),...", CombineMethods(), cache=True)
That is, the code generation that PEAK-Rules does now is to compute an
expression based on the types of the arguments, and look that tuple up in a
dictionary to get a method-combined function. If the tuple isn't found,
it's computed and then cached.
Good so far. But what if the generic function is something like an
operation to get a peak.web "view" of an object? One of the arguments is a
name. So we could have a more sophisticated lookup:
LookupExpr("type(arg1),type(arg2),name", CombineMethods(), cache=True)
So here we look up a tuple that includes the name. So far so good. Except
that it's a web app, so we've just created the opportunity for some jerk to
keep hitting randomly generated names so we'll use up all our memory
caching. :( So we need to do it a bit differently:
LookupExpr(
    "type(arg1),type(arg2)",
    LookupExpr("name", CombineMethods()),
    cache=True
)
Now, we only cache the type tuples, and use a nested dictionary for the
names, but we don't grow the cache when a lookup misses; we just take a
slower path in that case.
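A rough sketch of what that nested arrangement does at runtime (again hypothetical names, not the actual PEAK-Rules machinery): the outer table is cached per type tuple, while the inner name lookup never grows on a miss, so randomly generated names can't exhaust memory:

```python
def make_view_dispatcher(find_views, slow_path):
    """find_views(type_key) -> {name: callable}; cached per type tuple."""
    type_cache = {}
    def dispatch(obj, ctx, name):
        key = (type(obj), type(ctx))
        try:
            views = type_cache[key]
        except KeyError:
            views = type_cache[key] = find_views(key)
        method = views.get(name)
        if method is None:
            # unknown name: take the slower path, but no cache ever grows
            return slow_path(obj, ctx, name)
        return method(obj, ctx, name)
    return dispatch
```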
Each strategy has to embody a few different behaviors. It needs to be able
to generate what we might call "lookup code" and "launch code". The lookup
code is the code that looks something up when executing the strategy, and
the launch code is the code that's used to *invoke* that strategy from a
parent strategy. So CombineMethods() would have to know how to generate
code that calls the combined method, and LookupExpr would have to know how
to generate both kinds of code.
In the "launch" case, the LookupExpr simply assumes that the lookup table
is on the stack, and it generates code to compute the lookup expression and
perform the lookup. In the "lookup" case, the LookupExpr has to also
generate the code to put the lookup table on the stack in the first place.
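To illustrate the launch/lookup distinction, here's a toy version of the strategy interface. Real PEAK-Rules would emit bytecode; this sketch (my own invention, with made-up class shapes) emits Python source strings instead, where "the table is on the stack" becomes "the parent hands us an expression that evaluates to our table":

```python
class LookupExpr:
    """Strategy that indexes a table with the value of an expression."""
    def __init__(self, expr, next_strategy):
        self.expr, self.next = expr, next_strategy

    def launch_code(self, table_expr):
        # "launch": a parent already left our table "on the stack"
        # (table_expr); index it and hand off to the next strategy
        return self.next.launch_code("%s[(%s)]" % (table_expr, self.expr))

    def lookup_code(self, table_name):
        # "lookup": we must also emit the code that loads the table itself
        return self.launch_code(table_name)

class CombineMethods:
    def launch_code(self, table_expr):
        # the table entry is already a combined method; just call it
        return "%s(*args, **kw)" % table_expr
```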
The importance of this overall idea is not so much that it can be used to
customize the indexing approach of individual functions, as that it allows
code generation to be more thoroughly decoupled from the base "engine"
being used. My original conception of PEAK-Rules was that it would use
engine types to implement these types of implementation distinctions. But
now, it seems to me that just using a composable code generation system
might allow the system to use just one "engine" type, whose job is just to
co-ordinate the indexes required by the code generation strategy, and
manage things like automatically regenerating code when the function's
rules change.
And the most important thing about *that* is that it gives me a much better
sense of when the core framework is truly complete. When the code
generation is completely pluggable and composable, and I have the primitive
code-generation strategies available, then I can consider the core
framework to be finished. From then on out, everything else is just
optimizations or heuristics to allow better strategy chains to be selected
automatically. It'll still be possible and sensible to manually define
strategy chains (or even implement custom generation strategies) in some
cases, but it won't be *necessary*.
Another cool side-effect is that the core's bootstrapping process may
become simpler. The core generic function implementation needs to be able
to "use itself", which leads to some hairy bits in the current PEAK-Rules
implementation of type-tuple caching. If you were to add methods to a
generic function that's used to implement adding methods to generic
functions, for example, it would still need to be able to "use itself".
So, a benefit of the pluggable code-generation would be that we could
initially generate very simple code for these functions. And then, later,
we can upgrade the code generation strategy to do more when it's
needed. With a few special tricks to protect against re-entry, this
problem can thus be solved *once*, rather than needing new hacks for each
"meta-generic function" (generic function that implements generic functions).
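One such trick might look like this sketch (hypothetical, not PEAK-Rules code): while a generic function is busy regenerating its own dispatch code, re-entrant calls fall back to a dumb slow path instead of recursing into the code generator:

```python
def reentry_safe(regenerate, slow_dispatch):
    """Wrap a function so regeneration can safely 'use itself'."""
    state = {"busy": False, "fast": None}
    def dispatch(*args):
        fast = state["fast"]
        if fast is not None:
            return fast(*args)            # normal fast path
        if state["busy"]:
            return slow_dispatch(*args)   # re-entered: use the dumb path
        state["busy"] = True
        try:
            state["fast"] = regenerate()  # may call dispatch recursively
        finally:
            state["busy"] = False
        return state["fast"](*args)
    return dispatch
```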
Thus, it seems like I should refactor the core type-tuple implementation to
use this approach, at least once enough of the expression support library
(ast_builder and friends) is available to make it practical.