[PEAK] Generic Function API enhancement ideas
Phillip J. Eby
pje at telecommunity.com
Fri Dec 31 01:46:48 EST 2004
Convenience APIs for Generic Functions
======================================
As I've started playing around with using generic functions for things that
involve more than type and value tests, it seems to me that there are some
things that really need to be added to the current API. For example,
yesterday I saw some examples in C# of using test-driven development to
refine an algorithm, and I decided to take a whack at simply transcribing
the requirements directly into a generic function. The original code
computed bowling scores, and this is what I came up with:
import dispatch
[dispatch.generic()]
def score(rolls,frame=1):
"""Return score for bowling game, given list of a player's rolls"""
[score.when("frame<11 and rolls and rolls[0]==10")]
def score_strike(rolls,frame):
return sum(rolls[:3]) + score(rolls[1:],frame+1)
[score.when("frame<11 and rolls and rolls[0]<10 and sum(rolls[:2])==10")]
def score_spare(rolls,frame):
return sum(rolls[:3]) + score(rolls[2:],frame+1)
[score.when("frame<11 and rolls")]
def score_open(rolls,frame):
return sum(rolls[:2]) + score(rolls[2:],frame+1)
[score.when("not rolls or frame==11")]
def scoring_done(rolls,frame):
return 0
assert score([0]*20)==0
assert score([3]*20)==60
assert score([4,6,5,3]+[0]*16)==23
assert score([10,5,3,2,1]+[0]*14)==29
assert score([10]*20)==300
assert score([10,4,6]*5+[10])==200
Pseudo-Arguments/"Macros"
-------------------------
As you can see, the code suffers quite a bit from repetition, as the
"frame<11 and rolls" condition (or its inverse) appear repeatedly. Here's
what I'm thinking I'd like the rules to look like instead:
[dispatch.generic(
finished = "not rolls or frame>10",
strike = "rolls[0]==10",
spare = "not strike and sum(rolls[:2])==10"
)]
def score(rolls,frame=1):
"""Return score for bowling game, given list of a player's rolls"""
[score.when("not finished and strike")]
def score_strike(rolls,frame):
return sum(rolls[:3]) + score(rolls[1:],frame+1)
[score.when("not finished and spare")]
def score_spare(rolls,frame):
return sum(rolls[:3]) + score(rolls[2:],frame+1)
[score.when("not finished and not strike and not spare")]
def score_open(rolls,frame):
return sum(rolls[:2]) + score(rolls[2:],frame+1)
[score.when("finished")]
def scoring_done(rolls,frame):
return 0
In other words, I'd like to be able to create "pseudo-arguments" for a
generic function to represent commonly-used intermediate values or
conditions that the individual methods can then use in their rules. This
would help reduce repetition in complex rules like these, as well as allow
useful abstractions. I'm thinking that this could also be allowed on the
'.when()' decorators as well, except that there the definition would be
local to that individual rule. So, you could initially introduce
abstractions on an individual rule, and then move them up to the generic
function if you used them often.
I'm thinking that there could maybe be a Macro facility, e.g.::
finished = Macro("not rolls or frame>10")
And then you could use it in an expression if it was present in your
'locals()'. The idea is that a Macro found in an expression just gets
replaced with its expansion, as though it had been textually included in
the expression using it.
Expression Binding
------------------
A related (but different) idea is expression binding. Consider this
snippet from dispatch.predicates::
[expressionSignature.when(
# matches 'isinstance(expr,Const)'
"expr in Call and expr.function==isinstance"
" and len(expr.argexprs)==2 and expr.argexprs[1] in Const"
)]
def convertIsInstanceToClassTest(expr,test):
typecheck = _tupleToOrTest(expr.argexprs[1].value)
if not test.truth:
typecheck = NotTest(typecheck)
return Signature([(expr.argexprs[0],typecheck)])
Notice that this code computes 'expr.argexprs[1]' twice -- not a big deal
for this routine, but for compiler-like systems these things get
progressively more complicated to compute. Part of the issue is that a
pattern matching facility is needed, such that the above could say
something like:
[expressionSignature.when(
"match(expr,Call(isinstance,inst,cls))",
inst="var()", cls="var(Const)"
)]
def convertIsInstanceToClassTest(inst,cls,expr,test):
typecheck = _tupleToOrTest(cls.value)
if not test.truth:
typecheck = NotTest(typecheck)
return Signature([(inst,typecheck)])
There are a couple of different ideas I'm presenting here, that I don't
necessarily know how to fit together. One is the pattern matching part,
the other is binding variables from a test expression and supplying them as
extra arguments to the method. For example, above I've added 'inst' and
'cls' parameters to the method at the beginning of the method list. (If
the generic function were a method of a class, these parameters would
appear *before* 'self'.)
The idea here is that any parameter names of a method that match macros
that were defined in the 'when()' or the original 'generic()', then they
will receive the value of those expressions as of the current invocation.
That idea is pretty straightforward, and I'm fairly sure I know how I'd
implement it. My main question is whether that particular API would be
confusing. Here's another (silly) example of this idea:
[dispatch.generic(
finished = "not rolls or frame>10",
strike = "rolls[0]==10",
spare = "not strike and sum(rolls[:2])==10",
next_frame = "frame + 1"
)]
def score(rolls,frame=1):
"""Return score for bowling game, given list of a player's rolls"""
[score.when("not finished and strike")]
def score_strike(next_frame,rolls,frame):
return sum(rolls[:3]) + score(rolls[1:],next_frame)
[score.when("not finished and spare")]
def score_spare(next_frame,rolls,frame):
return sum(rolls[:3]) + score(rolls[2:], next_frame)
[score.when("not finished and not strike and not spare")]
def score_open(next_frame,rolls,frame):
return sum(rolls[:2]) + score(rolls[2:], next_frame)
[score.when("finished")]
def scoring_done(rolls,frame):
return 0
Here, for no particular reason, we've made 'next_frame' a macro and then
added it as a parameter to the methods that use it. As you can see, it's
optional -- scoring_done doesn't change its signature from the base
signature. The problem I see with it is that it's hard to see what
parameter(s) are the extra ones. Maybe if they had to be a tuple...
[expressionSignature.when(
"match(expr,Call(isinstance,inst,cls))",
inst="var()", cls="var(Const)"
)]
def convertIsInstanceToClassTest( (inst,cls), expr,test):
typecheck = _tupleToOrTest(cls.value)
if not test.truth:
typecheck = NotTest(typecheck)
return Signature([(inst,typecheck)])
That seems a bit better, in that the extra args get visually grouped
together. I think I could live with that.
Pattern Matching
----------------
Anyway, the second idea here is pattern matching, represented by the
"match()" and "var()" stuff. We want a way to traverse a data structure,
applying tests to it, and saving values that we find in variables for later
use. Ideally, this should be done in such a way that we end up generating
an identical test structure for the generic function, as if we had coded it
out the long way.
In the Cecil language's dispatch system (which pioneered predicate
dispatching), one would render something like the above 'when' clause as:
method expressionSignature(expr at Call{function=f, argexprs=ae}, test)
when test(f==isinstance) and test(len(ae)==2)
and let(inst,cls) = ae and cls at Const {
...
}
which is cleaner than our rendering in some respects, but not in
others. (What I wrote is actually a strange blend of Python and Cecil; in
Cecil you'd actually say something more like:
method expressionSignature(expr at Call{function at isinstance,
argexprs=ae}, test)
when and test(len(ae)==2) and let(inst,cls) = ae and cls at Const {
...
}
Because Cecil is a prototype-based language and '@' is sort of like a cross
between 'is' and 'isinstance()' in Python.
Anyway, mostly what typing all of this is telling me is that pattern
matching needs a lot more thought. :) I want a natural, Pythonic syntax
for expressing patterns and causing variables to be bound while applying
tests and traversing structure. And preferably, it should be syntactically
valid Python, so I don't have to write another parser. :) Then, there
needs to be some way to integrate it with bindings, even if it's
implemented via something like 'matches = aPattern.match(value)', and the
variables are then retrieved via 'matches.x', where 'x' is the assigned
variable name. How about this:
isA(Call, function=isinstance,
argexprs=(let(inst=Any),let(cls=isA(Const)))
)
Ugh. Lots of Incredibly Superficial Parentheses, there. And that's with a
lot of DWIM-ming that has to be implemented by the parser (for example, the
sequence of keyword arguments *counts*).
Here's something a bit closer to what I'd *like* the syntax to look like:
Instance(Call, lambda
function=isinstance,
argexprs=Sequence(var.inst[object], var.cls[Const]): None
).matches(expr)
This is pure Python, but obfuscated to say the least, using tricks like a
special 'var' object to create new variables, and 'lambda' just to obtain
an ordered list of name/value pairs! It's also pretty DWIMmish, guessing
that classes or types are instance checks, and other non-special objects
are equality tests.
Or how about:
expr in Call(isinstance, var.inst, var.cls[Const])
This one would only work with special support from 'Call', of course, so
it's not useful for all-purpose pattern matching. Hm. What if you could
do something like:
def isCall(function, *argexprs):
"""Match a 'Call' instance"""
return Pattern(
# direct test(s)
[Call],
# (traversalSpec,subpattern)...
[("it.function", function), ("it.argexprs", argexprs)]
)
And then you could use it like this:
[expressionSignature.when(
"match(expr)",
match = isCall(isinstance, var.inst, var.cls[Const])
inst = "match(expr).inst",
cls = "match(expr).cls"
)]
Which is a bit verbose, but it's getting better. The ideal would be more like:
[expressionSignature.when(
"expr in isCall(isinstance, var.inst, var.cls[Const])"
)]
As long as there was some way for the variables to propagate up so that
'when' could mark them for binding to arguments of the method. I'm going
to have to think about the implementation a bit though. There's going to
have to be a way for signatures and expressions to expose their "bound
variables", or at the least for the AST processing to propagate the bound
variable map upwards, which could change the signatures of parsing methods
throughout the system. BUT, it would mean that our final form of "expr in
isCall(...)" would create a decision tree identical to the one created by
the original, highly-verbose spelling.
Ordered Classifiers
-------------------
One last feature that seems useful is ordered classifiers; as an example:
frame_type = Classification()
finished = frame_type("not rolls or frame>10")
strike = frame_type("rolls[0]==10")
spare = frame_type("sum(rolls[:2])==10")
open = frame_type.default()
[dispatch.generic()]
def score(rolls,frame=1):
"""Return score for bowling game, given list of a player's rolls"""
[score.when("strike")]
def score_strike(rolls,frame):
return sum(rolls[:3]) + score(rolls[1:], frame+1)
[score.when("spare")]
def score_spare(rolls,frame):
return sum(rolls[:3]) + score(rolls[2:], frame+1)
[score.when("open")]
def score_open(rolls,frame):
return sum(rolls[:2]) + score(rolls[2:], frame+1)
[score.when("finished")]
def scoring_done(rolls,frame):
return 0
The idea here is that a Classification is an *ordered* set of conditions,
that automatically ensure that a later-defined classification will only
occur if the earlier patterns all fail. Once the 'default()' method is
invoked, no further patterns can be created for that classification.
Of course, this particular example might as well be a bunch of if-elif's at
this point, but there are certainly situations where you want to create a
classification like this that can be applied to many things, sometimes in
non-obvious combinations. To be really useful, a typical classification is
going to focus on one argument; it'd then be used like a class or interface
test, e.g. 'win1 in Iconized and win2 in Maximized' where 'Iconized' and
'Maximized' are classifications in a 'WindowState' ordered classifier.
Wrap-Up
-------
I'm about to fall asleep at the keyboard, so I'm going to wrap this up
now. I'm interested in hearing your feedback, as I am worried that some of
the ideas in this thing get a little confusing as to when stuff should be
quoted and when it shouldn't, and when it doesn't matter whether you quote
it or not. :)
Anyway, my general thoughts are that we need to add:
* Macros/pseudo-arguments (or some other form of abstraction for predicates)
* Expression bindings (to allow re-use of expressions computed during dispatch)
* Pattern matching (so tools like compilers can easily match complex patterns)
* Ordered classification groups (so that it's easy to do prioritized matches)
...but the syntax of these features is largely still up for grabs, and I'm
open to suggestions.
More information about the PEAK
mailing list